Abstract. On finite structures, there is a well-known connection between the expressive
power of Datalog, finite variable logics, the existential pebble game, and bounded hypertree
duality. We study this connection for infinite structures. This has applications for constraint
satisfaction with infinite templates. If the template $\Gamma$ is $\omega$-categorical, we present various
equivalent characterizations for whether the constraint satisfaction problem (CSP) for $\Gamma$ can
be solved by a Datalog program. We also show that CSP($\Gamma$) can be solved in polynomial time
for arbitrary $\omega$-categorical structures $\Gamma$ if the input is restricted to instances of bounded
tree-width. Finally, we prove universal-algebraic characterizations of those $\omega$-categorical
templates whose CSP has Datalog width 1, and for those whose CSP has strict Datalog
width $k$.

An extended abstract of this paper appeared in the proceedings of the 23rd International
Symposium on Theoretical Aspects of Computer Science (STACS'06) [6].
In a constraint satisfaction problem we are given a set of variables and a set of constraints on these
variables, and want to find an assignment of values from some domain $D$ to the variables such
that all the constraints are satisfied. The computational complexity of the constraint satisfaction
problem depends on the constraint language that is used in the instances of the problem. For finite
domains $D$, the complexity of the constraint satisfaction problem has attracted a lot of attention in
recent years [2,9,15,17,18,21,26,29,30,32,37]; this list covers only a very small fraction of relevant
publications, and we refer to a recent survey paper for a more complete account [8].

Constraint satisfaction problems where the domain $D$ is infinite have been studied in Artificial
Intelligence and the theory of binary relation algebras [19,34], with applications for instance in
temporal and spatial reasoning. Well-known examples of such binary relation algebras are the
point algebra, the containment algebra, Allen's interval algebra, and the left linear point algebra.
The corresponding constraint satisfaction problems received a lot of attention; see [14,19,27,34]
and the references therein.

Constraint satisfaction problems can be modeled as homomorphism problems [21]. For detailed
formal definitions of relational structures and homomorphisms, see Section 2. Let $\Gamma$ be a (finite
or infinite) structure with a finite relational signature $\tau$. Then the constraint satisfaction problem
(CSP) for $\Gamma$ is the following computational problem.

CSP($\Gamma$)

INSTANCE: A finite $\tau$-structure $S$.
QUESTION: Is there a homomorphism from $S$ to $\Gamma$?

The structure $\Gamma$ is called the template of the constraint satisfaction problem CSP($\Gamma$). For
example, if the template is the dense linear order of the rational numbers $(\mathbb{Q}, <)$, then it is easy
to see that CSP($\Gamma$) is the well-known problem of digraph-acyclicity.
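
To make the last example concrete, here is a minimal Python sketch (not taken from the paper) that decides an instance of CSP($(\mathbb{Q}, <)$) by topological sorting. It assumes Python 3.9 or later for the standard `graphlib` module; the function name `order_assignment` and the encoding of an instance as a list of pairs are illustrative choices only.

```python
from graphlib import TopologicalSorter, CycleError

def order_assignment(edges):
    """Return a map h from vertices to integers with h[u] < h[v] for every
    edge (u, v), or None if the digraph contains a directed cycle.
    Such a map is a homomorphism into (Q, <), since integers are rationals."""
    predecessors = {}
    for u, v in edges:
        predecessors.setdefault(u, set())
        predecessors.setdefault(v, set()).add(u)
    try:
        # static_order() lists every vertex after all of its predecessors.
        order = list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None
    return {vertex: position for position, vertex in enumerate(order)}
```

For an acyclic input the returned positions give a solution; for a directed cycle no assignment exists and the sketch returns `None`.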



Datalog. Datalog is the language of logic programs without function symbols, see e.g. [20,32].
Datalog is an important algorithmic tool to study the complexity of CSPs. For constraint
satisfaction with finite domains, this was first investigated systematically by Feder and Vardi [21].
For CSPs with infinite domains, among the most studied algorithms are the arc-consistency and the
path-consistency algorithms, which can also be formulated as Datalog programs.

Let $\tau$ be a relational signature; the relation symbols in $\tau$ will also be called the input relation
symbols. A Datalog program consists of a finite set of Horn clauses $C_1, \dots, C_k$ (they are also called
the rules of the Datalog program) containing atomic formulas with relation symbols from the
signature $\tau$, together with atomic formulas with some new relation symbols. These new relation
symbols are called IDBs (short for intensional database). Each clause is a set of literals where
at most one of these literals is positive. The positive literals do not contain an input relation.
Before we give formal definitions of the semantics of a Datalog program in Section 2, we show an
instructive example.

    tc(x, y) :- edge(x, y)
    tc(x, y) :- tc(x, u), tc(u, y)
    false    :- tc(x, x)

Here, the binary relation edge is the only input relation, tc is a binary relation computed
by the program, and false is a 0-ary relation computed by the program. The Datalog program
computes with the help of the relation tc the transitive closure of the edges in the input relation,
and derives false if and only if the input (which can be seen as a digraph defined on the variables)
contains a directed cycle. In general, we say that a problem is solved by a Datalog program if
the distinguished 0-ary predicate false is derived on an instance of the problem if and only if the
instance has no solution. This will be made precise in Section 2.

We say that a Datalog program $\Pi$ has width $(l,k)$, $0 \le l \le k$, if it has at most $l$ variables
in rule heads and at most $k$ variables per rule (we also say that $\Pi$ is an $(l,k)$-Datalog program).
A problem is of width $(l,k)$ if it can be solved by an $(l,k)$-Datalog program. The problem of
acyclicity has for instance width $(2,3)$, as demonstrated above. A problem has width $l$ if it is of
width $(l,k)$ for some $k \ge l$, and it is of bounded width if it has width $l$ for some $l \ge 0$. It is easy
to see that all bounded width problems are tractable, since the rules can derive only a polynomial
number of facts. It is an open question whether there is an algorithm that decides for a given finite
template $\Gamma$ whether CSP($\Gamma$) can be solved by a Datalog program of width $(l,k)$. Similarly, we do
not know how to decide width $l$, or bounded width, with the notable exception of width one [21]
(see also [18]).

The vast majority of the known results regarding Datalog programs only takes the number of
variables per rule into account. In this work, we are interested in capturing a finer distinction,
in which the number of variables in the head of the rule plays an important role. The double
parameterization, where we consider both the number $l$ of variables per head and the number $k$
of variables per rule, is less common but more general, and has already been considered in the
literature on constraint satisfaction and Datalog [21].

For finite templates $\Gamma$, it has been shown that there is a tight connection between the expressive
power of Datalog, finite variable logics, the so-called existential $k$-pebble game, and bounded
hypertree duality; these concepts will be introduced in Section 2 and the mentioned connection
will be formally stated in Section 3.

Results. We study the connection between the expressive power of Datalog, finite variable logics,
the existential pebble game, and bounded hypertree duality for infinite structures $\Gamma$. We show
that this connection fails for general infinite structures (Section 3), but holds if $\Gamma$ is $\omega$-categorical
(Section 4.3); the concept of $\omega$-categoricity is of central importance in model theory and will be
introduced in Section 4.1. It is well-known that the constraint satisfaction problems for the
binary relation algebras (and their fragments) mentioned above and many problems in temporal
and spatial reasoning can all be formulated with $\omega$-categorical structures.



Our results on the connection between bounded hypertree duality and Datalog have applications
for constraint satisfaction with $\omega$-categorical templates. We show that CSP($\Gamma$) can be
solved in polynomial time if $\Gamma$ is $\omega$-categorical and the input is restricted to instances of bounded
tree-width (in fact, it suffices that the cores of the instances have bounded tree-width).

We also investigate which constraint satisfaction problems can be solved with a Datalog program
in polynomial time, if no restriction is imposed on the input instances (Section 6). In particular,
we prove a characterization of constraint satisfaction problems with $\omega$-categorical templates
$\Gamma$ having width $(1, k)$, generalizing a result from [18].

To obtain this result, we show that every problem that is closed under disjoint unions and has
Datalog width one can be formulated as a constraint satisfaction problem with an $\omega$-categorical
template (Section 5); here we apply a recent model-theoretic result of Cherlin, Shelah, and Shi [12].
Another important tool to characterize the expressive power of Datalog for constraint satisfaction
is the notion of canonical Datalog programs. This concept was introduced by Feder and Vardi for
finite templates; we present a generalization to $\omega$-categorical templates. We prove that a CSP with
an $\omega$-categorical template can be solved by an $(l,k)$-Datalog program if and only if the canonical
$(l,k)$-Datalog program for $\Gamma$ solves the problem (Section 4.2).

A special case of width 1 problems are problems that can be decided by establishing
arc-consistency (sometimes also called hyperarc-consistency), which is a well-known and intensively
studied technique in artificial intelligence. We show that if a constraint satisfaction problem with
an $\omega$-categorical template can be decided by establishing arc-consistency, then it can also be
formulated as a constraint satisfaction problem with a finite template (Section 6).

Finally, we characterize strict width $l$, a notion that was again introduced for finite templates
and for $l \ge 2$ in [21]; for a formal definition see Section 6. Jeavons et al. [29] say that in this
case establishing strong $k$-consistency ensures global consistency. For finite templates, strict width
$l$ can be characterized by an algebraic closure condition [21,29]. In Section 7 we generalize this
result to $\omega$-categorical templates $\Gamma$ with a finite signature, and show that CSP($\Gamma$) has strict width
$l$ if and only if for every finite subset $A$ of $\Gamma$ there is an $(l+1)$-ary polymorphism of $\Gamma$ that is
a near-unanimity operation on $A$, i.e., it satisfies the identity $f(x, \dots, x, y, x, \dots, x) = x$ for all
$x, y \in A$ from the domain.



2 Definitions and basic facts 



A relational signature $\tau$ is a (here always at most countable) set of relation symbols $R_i$, each
associated with an arity $k_i$. A (relational) structure $\Gamma$ over relational signature $\tau$ (also called
$\tau$-structure) is a set $D_\Gamma$ (the domain) together with a relation $R_i \subseteq D_\Gamma^{k_i}$ for each relation symbol of
arity $k_i$. If necessary, we write $R^\Gamma$ to indicate that we are talking about the relation $R$ belonging
to the structure $\Gamma$. For simplicity, we denote both a relation symbol and its corresponding relation
with the same symbol. For a $\tau$-structure $\Gamma$ and $R \in \tau$ it will also be convenient to say that
$R(u_1, \dots, u_k)$ holds in $\Gamma$ iff $(u_1, \dots, u_k) \in R$. We sometimes use the shortened notation $\bar{x}$ for a
vector $x_1, \dots, x_n$ of any length. If we add relations to a given $\tau$-structure $\Gamma$, then the resulting
structure $\Gamma'$ with a larger signature $\tau' \supseteq \tau$ is called a $\tau'$-expansion of $\Gamma$, and $\Gamma$ is called a $\tau$-reduct
of $\Gamma'$.

The disjoint union of two $\tau$-structures $\Gamma$ and $\Gamma'$ is a $\tau$-structure that is defined on the disjoint
union of the domains of $\Gamma$ and $\Gamma'$. A relation holds on vertices of the disjoint union if and only
if it either holds in $\Gamma$ or in $\Gamma'$. A $\tau$-structure is called connected iff it is not the disjoint union of
two $\tau$-structures with a non-empty domain.

The Gaifman graph (sometimes also called the shadow) of a relational structure $A$ is a graph
on the vertex set $v_1, \dots, v_n$ of $A$, where two distinct vertices $v_k$ and $v_l$ are adjacent if there is a relation
in $A$ that is imposed on both $v_k$ and $v_l$, i.e., there is a relation $R$ such that $A$ satisfies $R(v_{i_1}, \dots, v_{i_j})$
and $k, l \in \{i_1, \dots, i_j\}$.
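
The definition can be sketched in a few lines of Python. The encoding of a structure as a dictionary from relation names to sets of tuples, and the name `gaifman_graph`, are assumptions made for this illustration.

```python
from itertools import combinations

def gaifman_graph(relations):
    """Return the edge set of the Gaifman graph as a set of 2-element frozensets.
    `relations` maps each relation name to a set of tuples over the domain."""
    edges = set()
    for tuples in relations.values():
        for t in tuples:
            # Two distinct vertices are adjacent if they occur in a common tuple.
            for u, v in combinations(set(t), 2):
                edges.add(frozenset((u, v)))
    return edges
```

A ternary tuple therefore contributes a triangle, while a tuple such as `(4, 4)` contributes no edge.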



2.1 Homomorphisms 



Let $\Gamma$ and $\Gamma'$ be $\tau$-structures. A homomorphism from $\Gamma$ to $\Gamma'$ is a function $f$ from $D_\Gamma$ to $D_{\Gamma'}$
such that for each $n$-ary relation symbol $R$ in $\tau$ and each $n$-tuple $\bar{a} = (a_1, \dots, a_n)$, if $\bar{a} \in R^\Gamma$,
then $(f(a_1), \dots, f(a_n)) \in R^{\Gamma'}$. In this case we say that the map $f$ preserves the relation $R$. Two
structures $\Gamma_1$ and $\Gamma_2$ are called homomorphically equivalent if there is a homomorphism from $\Gamma_1$
to $\Gamma_2$ and a homomorphism from $\Gamma_2$ to $\Gamma_1$.

A strong homomorphism $f$ satisfies the stronger condition that for each $n$-ary relation symbol
in $\tau$ and each $n$-tuple $\bar{a}$, $\bar{a} \in R^\Gamma$ if and only if $(f(a_1), \dots, f(a_n)) \in R^{\Gamma'}$. An embedding of $\Gamma$ in
$\Gamma'$ is an injective strong homomorphism; an isomorphism is a surjective embedding. Isomorphisms
from $\Gamma$ to $\Gamma$ are called automorphisms.

If $\Gamma$ and $\Gamma'$ are structures of the same signature, with $D_\Gamma \subseteq D_{\Gamma'}$, and the inclusion map is an
embedding, then we say that $\Gamma'$ is an extension of $\Gamma$, and that $\Gamma$ is a restriction of $\Gamma'$.

A partial mapping $h$ from a relational structure $A$ to a relational structure $B$ is called a partial
homomorphism (from $A$ to $B$) if $h$ is a homomorphism from a restriction of $A$ to $B$. As usual,
the restriction of a function $h$ to a subset $S$ of its domain is the mapping $h'$ with domain $S$ where
$h'(x) = h(x)$ for all $x \in S$; $h$ is called an extension of $h'$.

2.2 First-order logic 

First-order formulae $\varphi$ over the signature $\tau$ (or, short, $\tau$-formulae) are inductively defined using
the logical symbols of universal and existential quantification, disjunction, conjunction, negation,
equality, bracketing, variable symbols and the symbols from $\tau$. The semantics of a first-order
formula over some $\tau$-structure is defined in the usual Tarskian style. A $\tau$-formula without free
variables is called a $\tau$-sentence. We write $\Gamma \models \varphi$ iff the $\tau$-structure $\Gamma$ is a model for the $\tau$-sentence
$\varphi$; this notation is lifted to sets of sentences in the usual way. A good introduction to logic and
model theory is [28].

As usual, we can use first-order formulas over the signature $\tau$ to define relations over a given
$\tau$-structure $\Gamma$: for a formula $\varphi(x_1, \dots, x_k)$ where $x_1, \dots, x_k$ are the free variables of $\varphi$, the
corresponding relation $R$ is the set of all $k$-tuples $(t_1, \dots, t_k) \in D_\Gamma^k$ such that $\varphi(t_1, \dots, t_k)$ is true in
$\Gamma$.

A first-order formula $\varphi$ is said to be primitive positive (we say $\varphi$ is a pp-formula, for short) iff
it is of the form $\exists \bar{x} (\psi_1(\bar{x}) \wedge \dots \wedge \psi_k(\bar{x}))$ where $\psi_1, \dots, \psi_k$ are atomic formulas (which might be
equality relations of the form $x = y$).

2.3 Canonical queries 

A basic concept to link structure homomorphisms and logic is the canonical conjunctive query $\varphi^A$
of a finite relational structure $A$, which is a first-order formula of the form $\exists v_1, \dots, v_n . \psi_1 \wedge \dots \wedge \psi_m$,
where $v_1, \dots, v_n$ are the vertices of $A$, and $\{\psi_1, \dots, \psi_m\}$ is the set of atomic formulas of the form
$R(v_{i_1}, \dots, v_{i_j})$ that hold in $A$.

It is a fundamental property of the canonical query $\varphi^A$ that $\varphi^A$ holds in a structure $\Gamma$ if and
only if there is a homomorphism from $A$ to $\Gamma$ [11].
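
For finite structures this property can be checked by brute force. The following Python sketch (an illustration, not part of the paper) enumerates all maps from $A$ to $B$ and keeps those that preserve every relation; evaluating the canonical query of $A$ in $B$ amounts to asking whether this enumeration is non-empty. The encoding of a structure as a pair (list of elements, dictionary of relations) and the function names are assumptions.

```python
from itertools import product

def homomorphisms(A, B):
    """Yield every homomorphism from A to B as a dict.
    A structure is a pair (domain, relations), where relations maps a
    relation name to a set of tuples over the domain."""
    a_dom, a_rels = A
    b_dom, b_rels = B
    a_dom = list(a_dom)
    for image in product(list(b_dom), repeat=len(a_dom)):
        h = dict(zip(a_dom, image))
        # h is a homomorphism iff every tuple of A is mapped to a tuple of B.
        if all(tuple(h[a] for a in t) in b_rels.get(name, set())
               for name, tuples in a_rels.items() for t in tuples):
            yield h

def canonical_query_holds(A, B):
    """True iff the canonical conjunctive query of A holds in B."""
    return next(homomorphisms(A, B), None) is not None

cycle3 = ([0, 1, 2], {"E": {(0, 1), (1, 2), (2, 0)}})
cycle2 = ([0, 1], {"E": {(0, 1), (1, 0)}})
path2 = ([0, 1, 2], {"E": {(0, 1), (1, 2)}})
```

The search is exponential in the size of $A$ and is only meant to illustrate the definitions on very small structures.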

2.4 Finite variable logics 

Let $0 \le l \le k$ be positive integers. The class $\exists L^{k}_{\infty\omega}$ is defined as the class of formulas that have
at most $k$ variables and are obtained from atomic formulas using infinitary conjunction, infinitary
disjunction, and existential quantification only. The class $\bigcup_{k \ge 0} \exists L^{k}_{\infty\omega}$ is denoted by $\exists L^{\omega}_{\infty\omega}$.

We want to bring the parameter $l$ into the picture, and define the following refinement of $\exists L^{k}_{\infty\omega}$.
A conjunction $\bigwedge \Psi$ is called $l$-bounded if $\Psi$ is a collection of $\exists L^{k}_{\infty\omega}$ formulas $\psi$ that are quantifier-free
or have at most $l$ free variables. Similarly, a disjunction $\bigvee \Psi$ is called $l$-bounded if $\Psi$ is a collection
of $\exists L^{k}_{\infty\omega}$ formulas $\psi$ that are quantifier-free or have at most $l$ free variables. The set of $\exists L^{l,k}_{\infty\omega}$
formulas is defined as the restriction of $\exists L^{k}_{\infty\omega}$ obtained by only allowing infinitary $l$-bounded
conjunction and $l$-bounded disjunction instead of full infinitary conjunction and disjunction. Note
that $\bigcup_{0 \le l \le k} \exists L^{l,k}_{\infty\omega}$ equals $\exists L^{\omega}_{\infty\omega}$. The logic $\exists L^{\omega}_{\infty\omega}$ was introduced (under a different name) by
Kolaitis and Vardi as an existential negation-free variant of well-studied infinitary logics to study
the expressive power of Datalog [31]; in subsequent work, they used the name $\exists L^{\omega}_{\infty\omega}$ to denote
this logic [32] and we follow this convention.

We denote by $L^{l,k}$ the logic $\exists L^{l,k}_{\infty\omega}$ without disjunctions and with just finite $l$-bounded
conjunctions. In other words, a formula in $L^{l,k}$ is composed out of existential quantification and finitary
$l$-bounded conjunction, and uses only $k$ distinct variables. The language $L^{k}$, where only the
parameter $k$, but not the parameter $l \le k$ is specified, has been studied for example by Kolaitis and
Vardi [32] and later by Dalmau, Kolaitis, and Vardi [16].

2.5 Datalog 

We now formally define Datalog. Our definition will be purely operational; for the standard
semantical approach to the evaluation of Datalog programs see [20,32]. A Datalog program is a finite
set of Horn clauses, i.e., clauses of the form $\psi$ :- $\phi_1, \dots, \phi_l$ where $l \ge 0$ and where $\psi, \phi_1, \dots, \phi_l$
are atomic formulas. The formula $\psi$ is called the head of the rule, and $\phi_1, \dots, \phi_l$ is called the body.
For technical reasons, we assume that all variables in the head also occur in the body. The relation
symbols occurring in the head of some clause are called intensional database predicates (or IDBs),
and all other relation symbols in the clauses are called extensional database predicates (or EDBs).
A Datalog program has width $(l,k)$ if all IDBs are at most $l$-ary, and if all rules have at most $k$
distinct variables.

We might use Datalog programs to solve constraint satisfaction problems CSP($\Gamma$) for a
template $\Gamma$ with signature $\tau$ as follows. Let $\Pi$ be a Datalog program whose set of EDBs is $\tau$, and
let $\sigma$ be the set of IDBs of $\Pi$. We assume that there is one distinguished 0-ary intensional
relation symbol false. Now, suppose we are given an instance $S$ of CSP($\Gamma$). An evaluation of $\Pi$
on $S$ proceeds in steps $i = 0, 1, \dots$. At each step $i$ we maintain a $(\tau \cup \sigma)$-structure $S^i$; for every
$i \ge 0$ and every relation symbol $R \in \sigma$ we have that $R^{S^i} \subseteq R^{S^{i+1}}$. Each clause of $\Pi$ is
understood as a rule that may add tuples to relations in $S^{i+1}$, depending on $S^i$. Initially, we start with
the expansion $S^0$ of $S$ where all symbols from $\sigma$ denote the empty relation. Now suppose that
$R_1(u^1_1, \dots, u^1_{k_1}), \dots, R_l(u^l_1, \dots, u^l_{k_l})$ hold in $S^i$, and that
$R_0(y^0_1, \dots, y^0_{k_0})$ :- $R_1(y^1_1, \dots, y^1_{k_1}), \dots, R_l(y^l_1, \dots, y^l_{k_l})$
is a rule from $\Pi$, where $y^p_j = y^q_{j'}$ if and only if $u^p_j = u^q_{j'}$. Then we add the tuple
$(u^0_1, \dots, u^0_{k_0})$ to $R_0$ in $S^{i+1}$, where $u^0_j = u^q_{j'}$ if and only if $y^0_j = y^q_{j'}$. We also say that the Datalog
program derives $R_0(u^0_1, \dots, u^0_{k_0})$ from $R_1(u^1_1, \dots, u^1_{k_1}), \dots, R_l(u^l_1, \dots, u^l_{k_l})$.

The procedure stops if no new tuples can be derived. We say that $\Pi$ is sound for CSP($\Gamma$)
if $S$ does not homomorphically map to $\Gamma$ whenever $\Pi$ derives false on $S$. We say that $\Pi$ solves
CSP($\Gamma$) if $\Pi$ derives false (i.e., adds the 0-ary tuple to the relation for the symbol false) on $S$ if
and only if $S$ does not homomorphically map to $\Gamma$.
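
The stepwise evaluation can be illustrated by a small bottom-up evaluator, sketched here in Python and applied to the acyclicity program from the introduction. This is an assumption-laden illustration rather than the paper's formal semantics: it uses the usual Datalog convention that distinct variables of a rule may be bound to the same element, and the names `evaluate`, `match`, `RULES`, and `derives_false` are hypothetical.

```python
def match(body, facts, env):
    """Yield all variable bindings that make every atom of `body` a known fact."""
    if not body:
        yield env
        return
    (pred, args), rest = body[0], body[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        new_env = dict(env)
        # Bind each variable, rejecting the fact if a binding conflicts.
        if all(new_env.setdefault(v, c) == c for v, c in zip(args, fact_args)):
            yield from match(rest, facts, new_env)

def evaluate(rules, facts):
    """Apply the rules step by step until no new fact can be derived."""
    derived = set(facts)
    while True:
        new = set()
        for (head_pred, head_args), body in rules:
            for env in match(body, derived, {}):
                fact = (head_pred, tuple(env[v] for v in head_args))
                if fact not in derived:
                    new.add(fact)
        if not new:
            return derived
        derived |= new

# The (2,3)-Datalog program for digraph acyclicity.
RULES = [
    (("tc", ("x", "y")), [("edge", ("x", "y"))]),
    (("tc", ("x", "y")), [("tc", ("x", "u")), ("tc", ("u", "y"))]),
    (("false", ()), [("tc", ("x", "x"))]),
]

def derives_false(edges):
    facts = {("edge", tuple(e)) for e in edges}
    return ("false", ()) in evaluate(RULES, facts)
```

Each pass of the `while` loop corresponds to one step from $S^i$ to $S^{i+1}$; the loop ends when a step adds nothing.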

2.6 The existential pebble game 

The existential $k$-pebble game was studied in the context of constraint satisfaction for instance
in [16,21,32]. As in [21], we study this game with a second parameter, and first define the existential
$(l,k)$-pebble game. The usual existential $k$-pebble game is exactly the existential $(k-1,k)$-pebble
game in our sense. The second parameter is necessary to obtain the strongest formulations of our
results.

The game is played by the players Spoiler and Duplicator on (possibly infinite) structures $A$ and
$B$ of the same relational signature. Each player has $k$ pebbles, $p_1, \dots, p_k$ for Spoiler and $q_1, \dots, q_k$
for Duplicator. Spoiler places his pebbles on elements of $A$, Duplicator her pebbles on elements of
$B$. Initially, no pebbles are placed. In each round of the game Spoiler picks $k - l$ pebbles. If some of
these pebbles are already placed on $A$, then Spoiler removes them from $A$, and Duplicator responds
by removing the corresponding pebbles from $B$. Spoiler places the $k - l$ pebbles on elements of
$A$, and Duplicator responds by placing the corresponding pebbles on elements of $B$. Let $i_1, \dots, i_m$
be the indices of the pebbles that are placed on $A$ (and $B$) after the $i$-th round. Let $a_{i_1}, \dots, a_{i_m}$
($b_{i_1}, \dots, b_{i_m}$) be the elements of $A$ ($B$) pebbled with the pebbles $p_{i_1}, \dots, p_{i_m}$ ($q_{i_1}, \dots, q_{i_m}$) after
the $i$-th round. If the partial mapping $h$ from $A$ to $B$ defined by $h(a_{i_j}) = b_{i_j}$, for $1 \le j \le m$, is
not a partial homomorphism from $A$ to $B$, then the game is over, and Spoiler wins. Duplicator
wins if the game continues forever.

We would like to characterize the situations where Duplicator can win the game, i.e., where 
Spoiler does not have a winning strategy. It turns out that in this case Duplicator can always play 
memoryless in the sense that the decisions of Duplicator are only based on the current position 
of the pebbles, and not the previous decisions of the game. This holds for a much larger class of 
pebble games, see [23]. 

Definition 1. A (positional) winning strategy for Duplicator for the existential $(l,k)$-pebble
game on $A, B$ is a non-empty set $\mathcal{H}$ of partial homomorphisms from $A$ to $B$ such that

— $\mathcal{H}$ is closed under restrictions of its members, and

— for all functions $h$ in $\mathcal{H}$ with $|\mathrm{dom}(h)| = d \le l$ and for all $a_1, \dots, a_{k-d} \in A$ there is an
extension $h' \in \mathcal{H}$ of $h$ such that $h'$ is also defined on $a_1, \dots, a_{k-d}$.
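
On finite structures, Definition 1 suggests a simple (exponential-time) test, sketched below in Python as an illustration that is not taken from the paper. Start with all partial homomorphisms with at most $k$ elements in their domain and repeatedly delete maps that violate one of the two conditions; since the union of two winning strategies is again a winning strategy, Duplicator has a winning strategy exactly when something survives. The structure encoding (pair of domain list and relation dictionary) and all function names are assumptions.

```python
from itertools import combinations, product

def partial_homs(A, B, k):
    """All partial homomorphisms from A to B with at most k domain elements,
    each encoded as a frozenset of (a, b) pairs."""
    (a_dom, a_rels), (b_dom, b_rels) = A, B
    result = set()
    for size in range(k + 1):
        for dom in combinations(a_dom, size):
            for image in product(b_dom, repeat=size):
                h = dict(zip(dom, image))
                # Only tuples lying entirely inside the domain of h matter.
                if all(tuple(h[a] for a in t) in b_rels.get(name, set())
                       for name, tuples in a_rels.items()
                       for t in tuples if all(a in h for a in t)):
                    result.add(frozenset(h.items()))
    return result

def duplicator_wins(A, B, l, k):
    """True iff Duplicator has a winning strategy in the existential (l,k)-pebble game."""
    a_dom = A[0]
    H = partial_homs(A, B, k)
    changed = True
    while changed:
        changed = False
        for h in list(H):
            dom = {a for a, _ in h}
            # Closure under restrictions: all one-point restrictions must survive.
            bad = any(h - {pair} not in H for pair in h)
            if not bad and len(dom) <= l:
                # Extension property for every choice of k - d further elements.
                for extra in product(a_dom, repeat=k - len(dom)):
                    target = dom | set(extra)
                    if not any(h <= g and target <= {a for a, _ in g} for g in H):
                        bad = True
                        break
            if bad:
                H.discard(h)
                changed = True
    return bool(H)

cycle3 = ([0, 1, 2], {"E": {(0, 1), (1, 2), (2, 0)}})
cycle2 = ([0, 1], {"E": {(0, 1), (1, 0)}})
path2 = ([0, 1, 2], {"E": {(0, 1), (1, 2)}})
```

On the directed 3-cycle against the directed 2-cycle, Duplicator survives the $(1,2)$-game although no homomorphism exists, while the $(2,3)$-game exposes the missing homomorphism.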

2.7 Treewidth 

In what remains of the section we define the treewidth of a relational structure. As in [21], we 
need to extend the ordinary notion of treewidth for relational structures in such a way that we 
can introduce the additional parameter I. 

Let $0 \le l \le k$ be positive integers. An $(l,k)$-tree is defined inductively as follows:

— A $k$-clique is an $(l,k)$-tree.

— For every $(l,k)$-tree $G$ and for every $l$-clique induced by nodes $v_1, \dots, v_l$ in $G$, the graph $G'$
obtained by adding $k - l$ new nodes $v_{l+1}, \dots, v_k$ to $G$ and adding edges $(v_i, v_j)$ for all $i \ne j$
with $i \in \{1, \dots, k\}$, $j \in \{l+1, \dots, k\}$ (so that $v_1, \dots, v_k$ forms a $k$-clique) is also an $(l,k)$-tree.

A partial $(l,k)$-tree is a (not necessarily induced) subgraph of an $(l,k)$-tree.

Definition 2. Let $0 \le l \le k$ and let $\tau$ be a relational signature. We say that a $\tau$-structure $S$ has
treewidth $(l,k)$ if the Gaifman graph of $S$ is a partial $(l,k)$-tree.

If a structure has treewidth $(k, k+1)$ we also say that it has treewidth $k$, and it is not difficult
to see that these structures are precisely the structures of treewidth $k$ in the sense of [32]. It is
also possible to define partial $(l,k)$-trees by using tree-decompositions.

Definition 3. A tree-decomposition of a graph $G$ is a tree $T$ such that

1. The nodes of $T$ are sets of nodes of $G$;

2. Every edge of $G$ is entirely contained in some node of $T$;

3. If a node $v$ belongs to two nodes $x, y$ of $T$ it must also be in every node in the unique path
from $x$ to $y$.

A tree-decomposition $T$ is said to be of width $(l,k)$ if every node of $T$ contains at most $k$
nodes of $G$ and the intersection of two different nodes of $T$ has size at most $l$. The following is a
straightforward generalization of a well-known fact for the single parameter $k$, and the proof can be
obtained by adapting for instance the proof given in [3].

Proposition 1. A graph is a partial $(l,k)$-tree if and only if it has a tree-decomposition of width
$(l,k)$.
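
Conditions 2 and 3 of Definition 3 and the width condition are easy to verify mechanically. The Python sketch below is an illustration under stated assumptions: bags are given as a list of sets, the tree $T$ as a list of index pairs, and the caller is trusted that these pairs really form a tree; the function names are hypothetical.

```python
from itertools import combinations

def is_tree_decomposition(bags, tree_edges, graph_edges):
    """Check conditions 2 and 3 of Definition 3 (T is assumed to be a tree)."""
    # Condition 2: every edge of G lies inside some bag.
    if not all(any({u, v} <= bag for bag in bags) for u, v in graph_edges):
        return False
    adjacent = {i: set() for i in range(len(bags))}
    for i, j in tree_edges:
        adjacent[i].add(j)
        adjacent[j].add(i)
    # Condition 3: in a tree, this holds iff the bags containing a vertex
    # induce a connected subgraph of T.
    for v in set().union(*bags):
        holders = {i for i, bag in enumerate(bags) if v in bag}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in (adjacent[i] & holders) - seen:
                seen.add(j)
                stack.append(j)
        if seen != holders:
            return False
    return True

def has_width(bags, l, k):
    """Every bag has at most k elements and two different bags share at most l."""
    return (all(len(bag) <= k for bag in bags)
            and all(len(x & y) <= l for x, y in combinations(bags, 2)))
```

For the undirected path 1, 2, 3, 4 with bags {1,2}, {2,3}, {3,4} arranged in a line, both checks succeed with $(l,k) = (1,2)$.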

It was shown in [32] that the canonical query for a structure $S$ of tree-width $k$ can be expressed
in the logic $L^{k+1}$. We show an analogous statement for both parameters $l$ and $k$.



Lemma 1. Let $A$ be a finite structure of treewidth $(l,k)$. Then the canonical query $\varphi^A$ for $A$ is
expressible in $L^{l,k}$.

Proof. Let $A$ be a finite structure of treewidth $(l,k)$, let $G$ be its Gaifman graph, and let $T$ be a
tree-decomposition of $G$. Let us view $T$ as a rooted tree with root $t = \{a_1, \dots, a_{k'}\}$, $k' \le k$. We
shall show by structural induction on $T$ that there exists a formula $\phi^T(y_1, \dots, y_{k'})$ in $L^{l,k}$ with
free variables $y_1, \dots, y_{k'}$ such that for every structure $B$ and elements $b_1, \dots, b_{k'}$ in $B$ the following
two sentences are equivalent:

(1) The partial mapping from $A$ to $B$ that maps $a_i$ to $b_i$ for $1 \le i \le k'$ can be extended to a
homomorphism from $A$ to $B$;

(2) $\phi^T(b_1, \dots, b_{k'})$ holds in $B$.

The base case is when the tree contains only one node $t$. In this case $\phi^T$ is obtained by
removing the existential quantifier in the canonical conjunctive query of $A$. For the inductive step,
let $t_1, \dots, t_m$ be the children of the root $t$ in $T$. Consider the $m$ subtrees $T_1, \dots, T_m$ of $T$ obtained
by removing $t$. For every $i = 1, \dots, m$ we root $T_i$ at $t_i$ and consider the substructure $A_i$ of $A$
induced by the set of all nodes of $A$ contained in some node of $T_i$. Then, $T_i$ is a tree-decomposition
of $A_i$ and the induction hypothesis provides a formula $\phi^{T_i}$ for which (1) and (2) are equivalent.
Let $\phi^T(y_1, \dots, y_{k'})$ be the formula $\bigwedge \Psi$ where $\Psi$ is the following set of formulas:

(a) For each $i = 1, \dots, m$ the set $\Psi$ contains the formula obtained by existentially quantifying
all free variables $y_j$ in $\phi^{T_i}$ where $a_j \notin t$. Note that the resulting formula has at most $l$ free
variables.

(b) The set $\Psi$ contains the conjuncts from the canonical query of the substructure of $A$ induced
by the nodes in $t$ (as in the base case).

To show that (1) and (2) are equivalent, let $B$ be an arbitrary structure. By the properties
of the tree-decomposition we know that $h$ is a homomorphism from $A$ to $B$ that maps $a_j$ to $b_j$
if and only if for all $i = 1, \dots, m$ the restriction of $h$ to $A_i$ is a homomorphism from $A_i$ to $B$
and the restriction of $h$ to the elements of $t$ is a partial homomorphism from $A$ to $B$ as well. The
former condition is equivalent to the fact that the assignment $y_j \mapsto b_j$ satisfies every formula of
$\Psi$ included in (a). The latter condition is equivalent to the fact that the very same assignment
satisfies the formula introduced in (b). □

3 General infinite structures 

In this section, we recall facts known for finite structures [16,31], and show that they fail
if $\Gamma$ has an infinite domain. The following theorem was proved in [31] for finite structures.

Theorem 1. Let $A, B$ be finite relational structures over the same signature $\tau$. Then the following
are equivalent.

1. Duplicator has a winning strategy for the existential $k$-pebble game on $A$ and $B$;

2. All $\tau$-sentences in $\exists L^{k}_{\infty\omega}$ that hold in $A$ also hold in $B$;

3. Every finite $\tau$-structure whose core has treewidth $k - 1$ that homomorphically maps to $A$ also
homomorphically maps to $B$.

We show that for infinite structures $B$ it is in general not true that 3 implies 1.

Proposition 2. There are infinite $\tau$-structures $A$ and $B$ such that

— Duplicator does not have a winning strategy for the existential 2-pebble game on $A$ and $B$;

— Every finite $\tau$-structure $C$ of tree-width 1 that homomorphically maps to $A$ also maps to $B$.

Proof. Let $B$ be a disjoint union of directed paths of length $1, 2, \dots$ Consider $A$ = . Every finite
$\tau$-structure $C$ of tree-width 1 is a finite oriented tree, and therefore homomorphically maps to $A$
and to $B$. However, Spoiler clearly has a winning strategy. After Spoiler places his first pebble,
Duplicator has to place her first pebble on a path of length $l$ in $B$. By walking with his two pebbles
in one direction on the directed cycle $A$, Spoiler can trap Duplicator after $l$ rounds of the game. □

From now on, we are interested in computational problems of the form CSP($\Gamma$) for a countably
infinite structure $\Gamma$. The following theorem states results obtained in [16,21,31].

Theorem 2. Let $\Gamma$ be a $\tau$-structure over a finite domain. Then the following statements are
equivalent.

1. For all finite $\tau$-structures $A$, if Duplicator has a winning strategy for the existential $k$-pebble
game on $A$ and $\Gamma$, then $A$ is in CSP($\Gamma$).

2. The complement of CSP($\Gamma$) can be formulated in $\exists L^{k}_{\infty\omega}$.

3. For all finite $\tau$-structures $A$, if every finite $\tau$-structure $C$ of tree-width $k$ that homomorphically
maps to $A$ also maps to $\Gamma$, then $A$ homomorphically maps to $\Gamma$.

4. There is an obstruction set $\mathcal{N}$, i.e., a set of finite structures of tree-width $k - 1$ such that a
finite $\tau$-structure $A$ is homomorphic to $\Gamma$ if and only if no structure in $\mathcal{N}$ is homomorphic to
$A$.

Theorem 2 also fails for structures $\Gamma$ over an infinite domain. Intuitively, the reason is that
the expressive power of infinitary disjunction is relatively larger for CSP($\Gamma$) if $\Gamma$ has an infinite
domain.

Proposition 3. There is a $\tau$-structure $\Gamma$ over an infinite domain such that

— the complement of CSP($\Gamma$) can be formulated in $\exists L^{2}_{\infty\omega}$;

— there is a finite $\tau$-structure $A$ such that Duplicator wins the existential 2-pebble game on $A$
and $\Gamma$, but $A$ is not in CSP($\Gamma$).

Proof. We choose $\Gamma$ to be $(\mathbb{Q}, <)$, and let $A$ be the directed cycle on three vertices. Duplicator
wins the existential 2-pebble game on $A$ and $\Gamma$, but there is no homomorphism from $A$ to $\Gamma$.

The complement of CSP($(\mathbb{Q}, <)$) can be formulated in $\exists L^{2}_{\infty\omega}$. Let $\Phi$ be an $\exists L^{2}_{\infty\omega}$-sentence that
expresses that a structure contains copies of $(\{1, \dots, n\}, <)$ for arbitrarily large $n$. The structures
that are not in CSP($(\mathbb{Q}, <)$) are precisely the directed graphs containing a cycle; clearly, $\Phi$ holds
precisely on those finite directed graphs having a cycle. □

4 Datalog for $\omega$-categorical structures

The concept of $\omega$-categoricity is of central interest in model theory [10,28]. We show that many facts
that are known about Datalog programs for finite structures extend to $\omega$-categorical structures.

4.1 Countably categorical structures

A countable structure $\Gamma$ is called $\omega$-categorical if all countable models of the first-order theory
of $\Gamma$ are isomorphic to $\Gamma$. The following is a well-known and deep connection that shows that
$\omega$-categoricity of $\Gamma$ is a property of the automorphism group of $\Gamma$, without reference to concepts from
logic (see [28]). The orbit of an $n$-tuple $\bar{a}$ from $\Gamma$ is the set $\{\alpha(\bar{a}) \mid \alpha \text{ is an automorphism of } \Gamma\}$.

Theorem 3 (Engeler, Ryll-Nardzewski, Svenonius; see e.g. [28]). The following properties
of a countable structure $\Gamma$ are equivalent:

1. the structure $\Gamma$ is $\omega$-categorical;

2. for each $n \ge 1$, there are finitely many orbits of $n$-tuples in the automorphism group of $\Gamma$;

3. for each $n \ge 1$, there are finitely many inequivalent first-order formulas with $n$ free variables
over $\Gamma$;

4. a relation $R$ is first-order definable in $\Gamma$ if and only if it is preserved by all automorphisms of
$\Gamma$.
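
As a worked illustration of item 2 (not part of the paper), consider $(\mathbb{Q}, <)$. Two $n$-tuples of rationals lie in the same orbit exactly when their entries are ordered and identified in the same way, so the orbits of $n$-tuples correspond to the weak orderings of $n$ positions, a finite number for every $n$. The Python sketch below counts them in two ways; the function names are illustrative.

```python
from functools import lru_cache
from itertools import product
from math import comb

@lru_cache(maxsize=None)
def orbit_count(n):
    """Number of weak orderings of n positions: choose the j positions holding
    the smallest value, then weakly order the remaining n - j positions."""
    if n == 0:
        return 1
    return sum(comb(n, j) * orbit_count(n - j) for j in range(1, n + 1))

def orbit_count_brute_force(n):
    """Count order patterns of n-tuples directly; n values suffice to realise
    every pattern, so tuples over range(n) are enough."""
    patterns = set()
    for t in product(range(n), repeat=n):
        ranks = sorted(set(t))
        patterns.add(tuple(ranks.index(x) for x in t))
    return len(patterns)
```

For $n = 1, 2, 3, 4$ both functions give 1, 3, 13, and 75 orbits.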



Examples. An example of an $\omega$-categorical directed graph is the set of rational numbers with the
dense linear order $(\mathbb{Q}, <)$ [28]. The (tractable) constraint satisfaction problem for this structure
is digraph acyclicity. Another important example is the universal triangle-free graph $\ntriangleleft$. This
structure is the up to isomorphism unique countable $K_3$-free graph with the following extension
property: whenever $S$ is a subset and $T$ is a disjoint independent subset of the vertices in $\ntriangleleft$, then
$\ntriangleleft$ contains a vertex $v \notin S \cup T$ that is linked to no vertex in $S$ and to all vertices in $T$. Since
the extension property can be formulated by an (infinite) set of first-order sentences, it follows
that $\ntriangleleft$ is $\omega$-categorical [28]. The structure $\ntriangleleft$ is called the universal triangle-free graph, because
every other countable triangle-free graph embeds into $\ntriangleleft$. Hence, CSP($\ntriangleleft$) is clearly tractable.
However, this simple problem cannot be formulated as a constraint satisfaction problem with a
finite template [21,37].

For $\omega$-categorical templates we can apply the so-called algebraic approach to constraint
satisfaction [8,9,30]. This approach was originally developed for constraint satisfaction with finite
templates, but several fundamental theorems also hold for $\omega$-categorical templates [7,25].

The following lemma states an important property of $\omega$-categorical structures needed several
times later. The proof contains a typical proof technique for $\omega$-categorical structures.

Lemma 2. Let $\Gamma$ be a finite or infinite $\omega$-categorical structure with relational signature $\tau$, and
let $A$ be a countable relational structure with the same signature $\tau$. If there is no homomorphism
from $A$ to $\Gamma$, then there is a finite substructure of $A$ that does not homomorphically map to $\Gamma$.

Proof. Suppose every finite substructure of A homomorphically maps to Γ. We show the
contraposition of the lemma, and prove the existence of a homomorphism from A to Γ. Let a_1, a_2, ... be an
enumeration of A. We construct a directed acyclic graph with finite out-degree, where each node
lies on some level n ≥ 0. The nodes on level n are equivalence classes of homomorphisms from the
substructure of A induced by a_1, ..., a_n to Γ. Two such homomorphisms f and g are equivalent
if there is an automorphism α of Γ such that fα = g. Two equivalence classes of homomorphisms
on level n and n + 1 are adjacent if there are representatives of the classes such that one is a
restriction of the other. Theorem 3 asserts that Γ has only finitely many orbits of k-tuples, for
all k ≥ 0 (clearly, this also holds if Γ is finite). Hence, the constructed directed graph has finite
out-degree. By assumption, there is a homomorphism from the structure induced by a_1, a_2, ..., a_n
to Γ for all n ≥ 0, and hence the directed graph has vertices on all levels. König's Lemma asserts
the existence of an infinite path in the graph, which can be used to inductively define a
homomorphism h from A to Γ as follows.

The restriction of h to {a_1, ..., a_n} will be an element from the n-th node of the infinite path in
the graph. Initially, this is trivially true if h is restricted to the empty set. Suppose h is already
defined on a_1, ..., a_n, for n ≥ 0. By construction of the infinite path, we find representatives h_n
and h_{n+1} of the n-th and the (n+1)-st element on the path such that h_n is a restriction of h_{n+1}.
The inductive assumption gives us an automorphism f of Γ such that f(h_n(x)) = h(x) for all
x ∈ {a_1, ..., a_n}. We set h(a_{n+1}) to be f(h_{n+1}(a_{n+1})). The restriction of h to a_1, ..., a_{n+1}
will therefore be a member of the (n+1)-st element of the infinite path. The operation h defined
in this way is indeed a homomorphism from A to Γ. □

4.2 Canonical Datalog programs 

In this section we define the canonical Datalog program of an ω-categorical structure Γ. We will
later prove in Section 4.3 that CSP(Γ) can be solved by an (l, k)-Datalog program if and only if
the canonical (l, k)-Datalog program solves the problem.

For finite templates T with a relational signature τ the canonical Datalog program for T
was defined in [21]. This motivates the following definition of canonical Datalog programs for
ω-categorical structures Γ. The canonical (l, k)-Datalog program for Γ contains an IDB for every at
most l-ary primitive positive definable relation in Γ. The empty 0-ary relation serves as false. The
input relation symbols are precisely the relation symbols from τ.

Let Γ' be the expansion of Γ by all at most l-ary primitive positive definable relations in Γ. It is
well-known that first-order expansions of Γ, and hence also the structure Γ', are ω-categorical
(see e.g. [28]). Theorem 3 asserts that there is a finite number of inequivalent logical implications
Φ(x) of the form (∃y (ψ_1(x, y) ∧ ... ∧ ψ_j(x, y))) → R(x) in Γ' having at most k variables, where
ψ_1, ..., ψ_j are atomic formulas of the form R_1(z_1), ..., R_j(z_j) for IDBs or EDBs R_1, ..., R_j and
an IDB R. For each of these inequivalent implications, we introduce a rule

R(x) :- R_1(z_1), ..., R_j(z_j)

into the canonical Datalog program, if ∀x.Φ(x) is valid in Γ', in other words, if the pp-definable
relation for which the IDB R was introduced is implied by ∃y.ψ_1(x, y) ∧ ... ∧ ψ_j(x, y) in Γ'. Since
there are finitely many inequivalent implications Φ, the canonical (l, k)-Datalog program is finite.

On a given instance a Datalog program can only derive a finite number of facts, which is 
polynomial in the size of the instance. Thus, Datalog programs can be evaluated in polynomial 
time. Observe that the final stage of the evaluation of the canonical Datalog program on a given 
instance gives rise to an instance S' of CSP(Γ').
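
To illustrate why the evaluation is polynomial for a fixed program, here is a minimal sketch of naive bottom-up evaluation in Python. The encoding of rules as pairs (head, body) and of atoms as pairs (predicate, arguments) is an assumption made for this example, and the rule TRIANGLE is a hypothetical one-rule program that derives false on graphs containing a triangle. Each round tries at most |domain|^k substitutions per rule, where k bounds the number of variables per rule, and the number of rounds is bounded by the number of derivable facts.

```python
from itertools import product

def evaluate(rules, facts, domain):
    """Naive bottom-up evaluation: returns the least fixed point containing `facts`."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            variables = sorted({x for _, args in body for x in args})
            for values in product(domain, repeat=len(variables)):
                s = dict(zip(variables, values))
                if all((p, tuple(s[x] for x in args)) in derived for p, args in body):
                    fact = (head[0], tuple(s[x] for x in head[1]))
                    if fact not in derived:
                        derived.add(fact)
                        changed = True
    return derived

# Hypothetical example program:  false :- E(x,y), E(y,z), E(z,x)
TRIANGLE = [(("false", ()), [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))])]

def symmetric(edges):
    """Encode an undirected graph as a set of EDB facts for the relation E."""
    return {("E", (u, v)) for u, v in edges} | {("E", (v, u)) for u, v in edges}
```

The sketch relies on the assumption, also made in this paper, that all variables in the head of a rule appear in its body.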

The following is easy to see. 

Proposition 4. Let Γ be ω-categorical. Then the canonical (l, k)-Datalog program for Γ is sound
for CSP(Γ).

Proof. We have to show that if the canonical (l, k)-Datalog program derives false on a given
instance S, then S is unsatisfiable. Assume for contradiction that there is a homomorphism
f : S → Γ although the canonical (l, k)-program for Γ derives false on the instance S. The derivation
tree of false corresponds via f to a set of valid implications in the template. Finally, the implication
of false corresponds to an implication of the 0-ary empty relation in the template, a contradiction. □

4.3 Datalog for countably categorical structures 

The following theorem is the promised link between Datalog, the existential pebble game, finite
variable logics, and hypertree duality for ω-categorical structures. We present it in its most general
form with both parameters l and k. The assumption of ω-categoricity will be used for the transition
from 2 to 3 (note that the canonical Datalog program is only defined for ω-categorical structures).

A finite relational structure S is a core if every endomorphism of S is an automorphism of S.
It is easy to see that every finite relational structure is homomorphically equivalent to a core, and
that this core is unique up to isomorphism.
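
For small finite structures the definition of a core can be checked directly. The following brute-force Python sketch is an illustration only; the representation of a structure as a domain together with a dictionary from relation names to sets of tuples is our assumption. It uses the fact that an endomorphism of a finite structure is an automorphism if and only if it is bijective.

```python
from itertools import product

def endomorphisms(domain, relations):
    """Yield all endomorphisms of a finite structure (exponential brute force)."""
    domain = list(domain)
    for image in product(domain, repeat=len(domain)):
        f = dict(zip(domain, image))
        if all(tuple(f[a] for a in t) in tuples
               for tuples in relations.values() for t in tuples):
            yield f

def is_core(domain, relations):
    """A finite structure is a core iff every endomorphism is bijective."""
    n = len(set(domain))
    return all(len(set(f.values())) == n for f in endomorphisms(domain, relations))
```

For instance, the triangle K_3 is a core, whereas an undirected path with two edges is not, since it folds onto a single edge.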

Theorem 4. Let Γ be an ω-categorical structure with a finite relational signature τ, and let A be
a finite τ-structure. Then the following statements are equivalent.

1. Every sound (l, k)-Datalog program for CSP(Γ) does not derive false on A.

2. The canonical (l, k)-Datalog program for Γ does not derive false on A.

3. Duplicator has a winning strategy for the existential (l, k)-pebble game on A and Γ.

4. All sentences in L^{l,k} that hold in A also hold in Γ.

5. Every finite τ-structure with a core of treewidth (l, k) that homomorphically maps to A also
homomorphically maps to Γ.

Proof. The implication from 1 to 2 follows from Proposition 4. 

To show that 2 implies 3, we define a winning strategy ℋ for Duplicator as follows. It contains
all those partial mappings f : A → Γ with domain D of size at most k such that for every derived
IDB R(x_1, ..., x_d), where x_1, ..., x_d ∈ D, the tuple (f(x_1), ..., f(x_d)) belongs to R^{Γ'}.

By construction, ℋ contains only partial homomorphisms and is non-empty (since false is not
derived, ℋ contains the partial mapping with the empty domain). We shall prove that ℋ has the
(l, k)-extension property. We will omit the easier proof that ℋ is closed under restrictions. Let h
be a function in ℋ with domain {v_1, ..., v_{l'}} of size at most l and let D = {v_1, ..., v_{l'}, v_{l'+1}, ..., v_{k'}} be
a superset of {v_1, ..., v_{l'}} of size at most k. Let T be the k'-ary relation that contains all those
tuples (b_1, ..., b_{k'}) ∈ (D_Γ)^{k'} such that (b_{i_1}, ..., b_{i_r}) ∈ R^{Γ'} for every IDB R(v_{i_1}, ..., v_{i_r}) derived over
the domain D.

Consider the following rule with variables x_1, ..., x_{k'} of the canonical Datalog program. The
body of the rule contains for each IDB predicate R(v_{i_1}, ..., v_{i_r}) derived over the domain D the
atomic predicate R(x_{i_1}, ..., x_{i_r}). The head of the rule is S(x_1, ..., x_{l'}) where S^{Γ'} is the projection
of the relation T to the first l' arguments.

The instantiation x_i → v_i, i = 1, ..., k', allows us to derive S(v_1, ..., v_{l'}) by this rule, and, by the
definition of ℋ, the tuple (h(v_1), ..., h(v_{l'})) belongs to S^{Γ'}. By the definition of S^{Γ'}, there exist
b_{l'+1}, ..., b_{k'} such that (h(v_1), ..., h(v_{l'}), b_{l'+1}, ..., b_{k'}) belongs to T. Hence, if we extend h by
v_i → b_i for i from l' + 1, ..., k' we obtain the desired function.

Next, we show the implication from 3 to 4. The proof closely follows the corresponding proof
for finite structures given in [32], with the important difference that we have both parameters l
and k in our proof, whereas previously the results have only been stated with the parameter k.

Suppose Duplicator has a winning strategy ℋ for the existential (l, k)-pebble game on A and
Γ. Let φ be a τ-sentence from L^{l,k} that holds in A. We have to show that φ also holds in Γ. For
that, we prove by induction on the syntactic structure of L^{l,k} formulas that

if ψ(v_1, ..., v_m) is an L^{l,k} formula that is an l-bounded conjunction
or has at most l free variables (i.e., m ≤ l), then for all h ∈ ℋ and all elements a_1, ..., a_m from
the domain of h, if A satisfies ψ(a_1, ..., a_m), then Γ satisfies ψ(h(a_1), ..., h(a_m)).

Clearly, choosing m = 0, this implies that φ holds in Γ.

The base case of the induction is obvious, since atomic formulas are preserved under
homomorphisms. Next, suppose that ψ(v_1, ..., v_m) is an l-bounded conjunction of a set of formulas Ψ.
Then each formula in Ψ either has at most l free variables, or is quantifier-free. In both cases we
can use the inductive hypothesis, and the inductive step follows directly.

Assume that the formula ψ(v_1, ..., v_m) is of the form ∃u_1, ..., u_n. χ(v_1, ..., v_m, u_1, ..., u_n).
Since ψ is from L^{l,k}, we know that n + m ≤ k. If m > l, there is nothing to show. Otherwise, we
choose χ and n such that n is largest possible. Therefore, χ is either an l-bounded conjunction or
an atomic formula. We will use the inductive hypothesis for the formula χ(v_1, ..., v_m, u_1, ..., u_n).
Let h be a homomorphism in ℋ. We have to show that if a_1, ..., a_m are arbitrary elements from
the domain of h such that A satisfies ψ(a_1, ..., a_m), then Γ satisfies ψ(h(a_1), ..., h(a_m)).

Since A satisfies ∃u_1, ..., u_n. χ(a_1, ..., a_m, u_1, ..., u_n), there exist a_{m+1}, ..., a_{m+n} such that A
satisfies χ(a_1, ..., a_m, a_{m+1}, ..., a_{m+n}). Consider the restriction h* of h to the subset {a_1, ..., a_m} of
the domain of h. Because of the first property of winning strategies, the homomorphism h* is in
ℋ. Since m ≤ l, we can apply the (l, k)-extension property of ℋ to h* and a_{m+1}, ..., a_{m+n}, and there
are b_1, ..., b_n such that the extension h' of h* with domain {a_1, ..., a_{m+n}} that maps a_{m+i} to b_i is
in ℋ. By applying the induction hypothesis to χ(v_1, ..., v_m, u_1, ..., u_n) and to h', we infer that Γ
satisfies χ(h'(a_1), ..., h'(a_{m+n})), and hence Γ satisfies ψ(h(a_1), ..., h(a_m)).

4 implies 5. Let T be a finite τ-structure whose core T' has treewidth (l, k) such that T
homomorphically maps to A. By Lemma 1 there exists an L^{l,k}-sentence φ such that φ holds in a
structure B if and only if T' homomorphically maps to B. Since T' homomorphically maps to T and
hence to A, the sentence φ must hold in A. Then 4 implies that φ holds in Γ, and therefore T'
homomorphically maps to Γ. But then we can compose the homomorphism from T to T' and the
homomorphism from T' to Γ, which shows the claim.

We finally show that 5 implies 1. Assume 5, and suppose for contradiction that there is a sound
(l, k)-Datalog program Π for Γ that derives false on A. The idea is to use the 'derivation tree of
false' to construct a τ-structure S of tree-width (l, k) that homomorphically maps to A, but not
to Γ.

The construction proceeds by induction over the evaluation of Π on A. Suppose that
R_0(y^0_1, ..., y^0_{k_0}) is a literal derived by Π from previously derived literals
R_1(y^1_1, ..., y^1_{k_1}), ..., R_s(y^s_1, ..., y^s_{k_s}) on A. If R_i is an IDB, then we inductively assume
that we have constructed for each derived literal R_i(y^i_1, ..., y^i_{k_i}) a τ-structure S_i whose Gaifman
graph is a subgraph of an (l, k)-tree G_i and which has distinguished vertices v^i_1, ..., v^i_{k_i} that
induce a clique in G_i. If R_i is an EDB, we create fresh vertices v^i_1, ..., v^i_{k_i}, and define S_i to be
the following structure with vertices v^i_1, ..., v^i_{k_i}. The relation R_i in S_i equals
{(v^i_1, ..., v^i_{k_i})}, and all other relations in S_i are empty. Clearly, v^i_1, ..., v^i_{k_i} induce a clique in
the Gaifman graph of S_i, and the Gaifman graph is a partial (l, k)-tree.

Now, the structure S_0 has the distinguished vertices v^0_1, ..., v^0_{k_0}, and is obtained from the
τ-structures S_1, ..., S_s as follows. We start from the disjoint union of S_1, ..., S_s. When
y^i_j = y^r_t, then we identify v^i_j and v^r_t, for 0 ≤ i, r ≤ s, 1 ≤ j ≤ k_i, and 1 ≤ t ≤ k_r (recall our
assumption that all variables in the head also appear in the body). Since we know by inductive
assumption that the tree-width of S_1, ..., S_s is (l, k), it is clear from this construction that the
Gaifman graph G_0 of S_0 has tree-width (l, k) as well. Moreover, v^0_1, ..., v^0_{k_0} induce a k_0-clique in
G_0. Observe that if we would then apply the program Π on S_0, and if Π would derive
R_1(v^1_1, ..., v^1_{k_1}), ..., R_s(v^s_1, ..., v^s_{k_s}), then it would also derive R_0(v^0_1, ..., v^0_{k_0}). In this
fashion we proceed for all inference steps of the Datalog program.

Let S be the resulting structure of treewidth (l, k) for the final derivation of false. There is a
homomorphism from S to A: each vertex v in S was introduced for a variable x in A, and if we
map v to x the resulting mapping clearly is a homomorphism. Now, suppose for contradiction that
there is a homomorphism h from S to Γ. Then the observation we made in the last paragraph
implies that the Datalog program derives false on S as well. Since the Datalog program is sound
for CSP(Γ), there cannot be a homomorphism from S to Γ, a contradiction. □
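
Statement 3 of Theorem 4 can be explored computationally when the template is replaced by a finite structure B. The Python sketch below is an illustration under assumptions of our own: it takes a winning strategy to be a non-empty family of partial homomorphisms with domains of size at most k that is closed under restrictions and has the (l, k)-extension property used in the proof above, and it computes the largest such family by repeatedly deleting maps that violate one of the two conditions. The function names and the encoding of structures are hypothetical, and the procedure is only practical for very small inputs.

```python
from itertools import combinations, product

def partial_homs(A_dom, B_dom, A_rel, B_rel, k):
    """All partial homomorphisms from A to B with domain of size at most k."""
    homs = set()
    for size in range(k + 1):
        for dom in combinations(sorted(A_dom), size):
            for img in product(sorted(B_dom), repeat=size):
                h = dict(zip(dom, img))
                if all(tuple(h[a] for a in t) in B_rel[R]
                       for R, tuples in A_rel.items() for t in tuples
                       if all(a in h for a in t)):
                    homs.add(frozenset(h.items()))
    return homs

def duplicator_wins(A_dom, B_dom, A_rel, B_rel, l, k):
    """Greatest family closed under restrictions with the (l, k)-extension property."""
    H = partial_homs(A_dom, B_dom, A_rel, B_rel, k)
    changed = True
    while changed:
        changed = False
        for h in list(H):
            dom = {a for a, _ in h}
            # closure under restrictions (one-point restrictions suffice at the fixed point)
            bad = any(frozenset(h - {p}) not in H for p in h)
            # extension property: every superset D of dom with |D| <= k needs an extension in H
            if not bad and len(dom) <= l:
                for size in range(len(dom) + 1, k + 1):
                    for extra in combinations(sorted(set(A_dom) - dom), size - len(dom)):
                        D = dom | set(extra)
                        if not any(h <= g and {a for a, _ in g} == D for g in H):
                            bad = True
            if bad:
                H.discard(h)
                changed = True
    return frozenset() in H
```

On the triangle K_3 and the single edge K_2, Duplicator wins the (1, 2)-game although K_3 is not 2-colourable, and loses the (2, 3)-game.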

4.4 Application to constraint satisfaction 

We discuss an important consequence of Theorem 4 with many concrete applications: we prove that
CSP(Γ) for ω-categorical Γ is tractable if the input is restricted to instances of tree-width (l, k).
In fact, for the tractability result we only have to require that the cores of the input structures
have bounded tree-width. The statement where we only require that the core of input structures
has tree-width (l, k) is considerably stronger (also see [24]); the corresponding statement for finite
structures and single parameter k has been observed in [16].

Corollary 1. Let Γ be an ω-categorical structure with finite relational signature τ. Then every
instance A of CSP(Γ) whose core has tree-width (l, k) can be solved in polynomial time by the
canonical (l, k)-Datalog program.

Proof. It is clear that an (l, k)-Datalog program can be evaluated on a (finite!) instance A of
CSP(Γ) in polynomial time. If the canonical (l, k)-Datalog program derives false on A, then,
because the canonical Datalog program is always sound, the instance A is not homomorphic to
Γ. Now, suppose that the canonical Datalog program does not derive false on a finite structure A
whose core has tree-width (l, k). Then, by Theorem 4, every τ-structure whose core has tree-width
(l, k) that homomorphically maps to A also homomorphically maps to Γ. This holds in particular
for A itself, and hence A is homomorphic to Γ. □

The following direct consequence of Theorem 4 yields other characterizations of bounded Dat- 
alog width. 

Theorem 5. Let Γ be an ω-categorical structure with a finite relational signature τ. Then the
following statements are equivalent.

1. There is an (l, k)-Datalog program that solves CSP(Γ).

2. The canonical (l, k)-Datalog program solves CSP(Γ).

3. For all finite τ-structures A, if Duplicator has a winning strategy for the existential (l, k)-pebble
game on A and Γ, then A is in CSP(Γ).

4. For all finite τ-structures A, if all sentences in L^{l,k} that hold in A also hold in Γ, then A
homomorphically maps to Γ.

5. For all finite τ-structures A, if every finite τ-structure S of tree-width (l, k) that
homomorphically maps to A also homomorphically maps to Γ, then A homomorphically maps to Γ.

6. There is a set N of finite structures of tree-width (l, k) such that every finite τ-structure A is
homomorphic to Γ if and only if no structure in N is homomorphic to A.



Proof. Suppose that an (l, k)-Datalog program Π solves CSP(Γ), and let S be an instance of
CSP(Γ). If the canonical (l, k)-Datalog program derives false on S, then by Proposition 4 the
structure S is not homomorphic to Γ. Otherwise, since Π is sound, the implication from 2 to 1
in Theorem 4 shows that Π does not derive false on S either, and since Π solves CSP(Γ), the
structure S is homomorphic to Γ. Hence, the canonical Datalog program solves CSP(Γ), and this is
the proof of the implication from 1 to 2. The implications from 2 to 3, 3 to 4, 4 to 5, and 5 to 1
are straightforward consequences of Theorem 4.

To show that 5 implies 6, let N be the set of all those finite structures of tree-width (l, k) that do
not homomorphically map to Γ. Let A be a finite τ-structure. If A homomorphically maps to Γ,
then clearly there is no structure C in N that maps to A, because then C would also map to Γ, a
contradiction to the definition of N. Conversely, suppose that no structure in N homomorphically
maps to A. In other words, every finite structure of tree-width (l, k) that homomorphically maps to
A also maps to Γ. Using 5, this implies that A homomorphically maps to Γ.

Finally, 6 implies 1. Let N be as in 6, and let A be a finite τ-structure such that every finite
τ-structure S of tree-width (l, k) that homomorphically maps to A also homomorphically maps to
Γ. No structure in N is homomorphic to Γ, since every structure in N is homomorphic to itself.
In particular, no structure in N homomorphically maps to A. Therefore, A homomorphically
maps to Γ. This shows 5, and hence 1. □



5 1-Datalog, MMSNP, and constraint satisfaction 

In this section we show that every class of structures with Datalog width one can be formulated
as a constraint satisfaction problem with an ω-categorical template. A Datalog program of width
one accepts a class of structures that can be described by a sentence of a fragment of existential
second order logic called monotone monadic SNP without inequalities (MMSNP). We show that
every problem in MMSNP can be formulated as the constraint satisfaction problem for an
ω-categorical template.

An SNP sentence is an existential second-order sentence with a universal first-order part. The
first order part might contain the existentially quantified relation symbols and additional relation
symbols from a given signature τ (the input relations). We shall assume that SNP formulas are
written in negation normal form, i.e., the first-order part is written in conjunctive normal form,
and each disjunction is written as a negated conjunction of positive and negative literals. The class
SNP consists of all problems on τ-structures that can be described by an SNP sentence.

The class MMSNP, defined by Feder and Vardi, is the class of problems that can be described
by an SNP sentence with the additional requirements that the existentially quantified relations 
are monadic, that every input relation symbol occurs negatively in the SNP sentence, and that it 
does not contain inequalities. Every problem in MMSNP is under randomized Turing reductions 
equivalent to a constraint satisfaction problem with a finite template [21]; a deterministic reduc- 
tion was recently announced by Kun [33]. It is easy to see that MMSNP contains all constraint 
satisfaction problems with finite templates. Thus, MMSNP has a dichotomy if and only if CSP 
has a dichotomy. 

It is easy to see that (1, k)-Datalog is contained in MMSNP: We introduce an existentially
quantified unary predicate for each of the unary IDBs in the Datalog program. It is then
straightforward to translate the rules of the Datalog program into first-order formulas with at most k
first-order variables. We now want to prove that every problem in MMSNP can be formulated
as a constraint satisfaction problem with a countably categorical template. In full generality, this
cannot be true because constraint satisfaction problems are always closed under disjoint union
(a simple example of a MMSNP problem not closed under disjoint union is the one defined by
the formula ∀x, y ¬(P(x) ∧ Q(y))). Hence we shall assume that we are dealing with a problem in
MMSNP that is closed under disjoint union.
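
The failure of closure under disjoint unions for the sentence ∀x, y ¬(P(x) ∧ Q(y)) can be checked on a two-element example. The Python sketch below is only an illustration; the encoding of a structure as a triple (domain, P, Q) and the function names are our own assumptions.

```python
def satisfies(structure):
    """Check the sentence: for all x, y, not (P(x) and Q(y))."""
    _, P, Q = structure
    return not any(True for _ in P for _ in Q)

def disjoint_union(s1, s2):
    """Disjoint union of two structures; elements are tagged by their origin."""
    def tag(xs, i):
        return {(i, x) for x in xs}
    return tuple(tag(a, 1) | tag(b, 2) for a, b in zip(s1, s2))

only_p = ({0}, {0}, set())   # one element, lying in P
only_q = ({0}, set(), {0})   # one element, lying in Q
```

Both `only_p` and `only_q` satisfy the sentence, but their disjoint union contains an element in P and an element in Q and therefore violates it.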

To prove the claim under this assumption, we need a recent model-theoretic result of Cherlin,
Shelah and Shi [12]. Let N be a finite set of finite structures with a relational signature τ. In this
paper, a τ-structure A is called N-free if there is no homomorphism from any structure in N to
A. A structure Γ in a class of countable structures C is called universal for C, if it contains all
structures in C as an induced substructure.



Theorem 6 (of [12]). Let N be a finite set of finite connected τ-structures. Then there is an
ω-categorical structure A that is universal for the class of all countable N-free structures.

Cherlin, Shelah and Shi proved this statement for (undirected) graphs, but the proof does
not rely on this assumption on the signature, and works for arbitrary relational signatures. The
statement in its general form also follows from a result in [13]. We use the ω-categorical structure
A to prove the following.

Theorem 7. Every problem in MMSNP that is closed under disjoint unions can be formulated as
CSP(Γ) with an ω-categorical template Γ.

Proof. Let Φ be a MMSNP sentence with input signature τ whose set M of finite models is
closed under disjoint unions. We have to find an ω-categorical τ-structure Γ, such that M equals
CSP(Γ). Recall the assumption that Φ is written in negation normal form. Let P_1, ..., P_k be the
existential monadic predicates in Φ. By monotonicity, all literals with input relations are
positive. For each existential monadic relation P_i we introduce a relation symbol P'_i, and replace
negative literals of the form ¬P_i(x) in Φ by P'_i(x). We shall denote the formula obtained after this
transformation by Φ'. Let τ' be the signature containing the input relations from τ, the existential
monadic relations P_i, and the symbols P'_i for the negative occurrences of the existential relations.
We define N to be the set of τ'-structures containing for each clause ¬(L_1 ∧ ... ∧ L_m) in Φ' the
canonical database [11] of (L_1 ∧ ... ∧ L_m). We shall use the fact that a τ'-structure S satisfies
a clause ¬(L_1 ∧ ... ∧ L_m) if and only if the canonical database of (L_1 ∧ ... ∧ L_m) is not
homomorphic to S.
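
The fact used at the end of the previous paragraph can be tested directly on small structures. In the following Python sketch, which is an illustration with names and encodings of our own choosing, a conjunction is a list of atoms (relation, variables), its canonical database has the variables as elements and the atoms as tuples, and homomorphisms are searched by brute force.

```python
from itertools import product

def canonical_database(conjunction):
    """Variables become elements; every atom contributes one tuple."""
    domain = sorted({x for _, args in conjunction for x in args})
    relations = {}
    for rel, args in conjunction:
        relations.setdefault(rel, set()).add(tuple(args))
    return domain, relations

def homomorphic(src, dst):
    """Brute-force test for a homomorphism between finite structures."""
    (sdom, srel), (ddom, drel) = src, dst
    for image in product(sorted(ddom), repeat=len(sdom)):
        h = dict(zip(sdom, image))
        if all(tuple(h[x] for x in t) in drel.get(rel, set())
               for rel, tuples in srel.items() for t in tuples):
            return True
    return False

def satisfies_clause(structure, conjunction):
    """S satisfies not(L_1 and ... and L_m) iff the canonical database does not map to S."""
    return not homomorphic(canonical_database(conjunction), structure)
```

For example, the clause ¬(E(x, y) ∧ P(x) ∧ P(y)) forbids an edge with both endpoints in P.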

We can assume without loss of generality that Φ is minimal in the sense that if we remove a
literal from some of the clauses the formula obtained is inequivalent. We shall show that then all
structures in N are connected. Let us suppose that this is not the case. Then there is a clause C in
Φ that corresponds to a non-connected structure in N. The clause C can be written as ¬(E ∧ F)
where the set X of variables in E and the set Y of variables in F do not intersect. Consider
the formulas Φ_E and Φ_F obtained from Φ by replacing C by ¬E and C by ¬F, respectively. By
minimality of Φ there is a structure M_E that satisfies Φ but not Φ_E, and similarly there exists
a structure M_F that satisfies Φ but not Φ_F. By assumption, the disjoint union M of M_E and
M_F satisfies Φ. Then there exists a τ''-expansion M'' of M where τ'' = τ ∪ {P_1, ..., P_k} that
satisfies the first-order part of Φ. Consider the substructures M''_E and M''_F of M'' induced by the
vertices of M_E and M_F. We have that M''_E does not satisfy the first-order part of Φ_E (otherwise
M_E would satisfy Φ_E). Consequently, there is an assignment s_E of the universal variables to
elements of M_E that falsifies some clause. This clause must necessarily be ¬E (since otherwise M''
would not satisfy the first-order part of Φ). By similar reasoning we can infer that there is an
assignment s_F of the universal variables of Φ to elements of M_F that falsifies ¬F. Finally, fix any
assignment s that coincides with s_E over X and with s_F over Y (such an assignment exists because
X and Y are disjoint). Clearly, s falsifies C, and hence M'' does not satisfy the first-order part of Φ,
a contradiction. Hence, we shall assume that every structure in N is connected.

Then Theorem 6 asserts the existence of an N-free ω-categorical τ'-structure A that is universal
for all N-free structures. We use A to define the template Γ for the constraint satisfaction problem.
To do this, restrict the domain of A to those points that have the property that either P_i or P'_i
holds (but not both P_i and P'_i) for all existential monadic predicates P_i. The resulting structure
A' is non-empty, since the problem defined by Φ is non-empty. Then we take the reduct of A'
that only contains the input relations from τ. It is well-known [28] that reducts and first-order
restrictions of ω-categorical structures are again ω-categorical. Hence the resulting τ-structure Γ
is ω-categorical.

We claim that a τ-structure S satisfies Φ if and only if S ∈ CSP(Γ). Let S be a structure
that has a homomorphism h to Γ. Let S' be the τ'-expansion of S such that for each i = 1, ..., k
the relation P_i(x) holds in S' if and only if P_i(h(x)) holds in A', and P'_i(x) holds in S' if and
only if P'_i(h(x)) holds in A'. Clearly, h defines a homomorphism from S' to A' and also from S'
to A. In consequence, none of the structures from N maps to S' (otherwise it would also map to
A). Hence, the τ''-reduct of S' satisfies all the clauses of the first-order part of Φ, and hence S
satisfies Φ.

Conversely, let S be a structure satisfying Φ. Consequently, there exists a τ'-expansion S' of S
that satisfies the first-order part of Φ' and where for every element x exactly one of P_i(x) or P'_i(x)
holds. Clearly, no structure in N is homomorphic to the expanded structure, and by universality
of A the τ'-structure S' is an induced substructure of A. Since for every point of S' exactly one
of P_i and P'_i holds, S' is also an induced substructure of A'. Consequently, S is homomorphic to
the τ-reduct Γ of A'. This completes the proof. □

In particular, we proved the following. 

Theorem 8. Every problem in (1, k)-Datalog that is closed under disjoint unions can be
formulated as a constraint satisfaction problem with an ω-categorical template.

For a typical example of a constraint satisfaction problem in MMSNP that cannot be described
with a finite template [37] and that is not in (l, k)-Datalog for all 1 ≤ l ≤ k, consider the following
computational problem. Given is a finite graph S, and we want to test whether we can partition
the vertices of S in two parts such that each part is triangle-free. The ω-categorical template that is
used in the proof of Theorem 7 consists of two copies C_1 and C_2 of ⋪, where we add an undirected
edge from all vertices in C_1 to all vertices in C_2. The corresponding constraint satisfaction problem
is NP-hard [1,25].
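
For very small graphs the problem can be decided by exhaustive search. The following Python sketch is an illustration only, with function names of our own choosing; it tries all 2^n partitions, which is consistent with the NP-hardness of the problem and is not meant as an efficient algorithm.

```python
from itertools import combinations, product

def has_triangle(part, adj):
    """Check whether the vertex set `part` contains three pairwise adjacent vertices."""
    return any(b in adj[a] and c in adj[a] and c in adj[b]
               for a, b, c in combinations(sorted(part), 3))

def two_triangle_free_parts(vertices, edges):
    """Search for a partition of the vertices into two triangle-free parts."""
    vertices = list(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for colours in product((0, 1), repeat=len(vertices)):
        parts = [{v for v, c in zip(vertices, colours) if c == i} for i in (0, 1)]
        if not any(has_triangle(p, adj) for p in parts):
            return True
    return False
```

The complete graph K_4 has such a partition (two parts of size two), whereas K_5 does not, because one part always contains three pairwise adjacent vertices.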

6 Bounded width 

We characterize the ω-categorical templates whose constraint satisfaction problems have bounded
width. The results generalize algebraic characterizations of Datalog width that are known for
constraint satisfaction with finite templates. However, not all results remain valid for infinite
templates: It is well-known [21] that the constraint satisfaction problem of a finite template has
Datalog width one if and only if the so-called arc-consistency procedure solves the problem. This is
no longer true for infinite templates. We characterize both width one and the expressive power of the
arc-consistency procedure for infinite ω-categorical templates, and present an example that shows
that the two concepts are different. We also present an algebraic characterization of strict width
l. Note that width one and strict width l are the only concepts of bounded Datalog width that
are known to be decidable for finite templates.

6.1 Width zero 

An example of a template whose constraint satisfaction problem has width 0 is the universal
triangle-free graph ⋪. Since there is a primitive positive sentence that states the existence of a
triangle in a graph, and since every graph without a triangle is homomorphic to ⋪, there is a
Datalog program of width 0 that solves CSP(⋪). In general, it is easy to see that a constraint
satisfaction problem has width 0 if and only if there is a finite set of obstructions for CSP(Γ), i.e.,
a finite set N of finite τ-structures such that every finite τ-structure A is homomorphic to Γ if and
only if no structure in N is homomorphic to A. Since the complement of this class is closed
under homomorphisms, we can apply Rossman's theorem and obtain the result that a constraint
satisfaction problem with an arbitrary infinite template has width 0 if and only if it is first-order
definable [39]. For finite templates a characterization of first-order definable constraint satisfaction
problems was obtained in [35] building on work in [2,38]. Our discussion can be summarized by the
following theorem.

Theorem 9. For every infinite (and not necessarily ω-categorical) template Γ the following are
equivalent.

- CSP(Γ) is first-order definable;

- CSP(Γ) has a finite obstruction set;

- CSP(Γ) has Datalog width 0.

Moreover, if CSP(Γ) is first-order definable we can always find an ω-categorical structure Γ' that
has the same constraint satisfaction problem as Γ.

Proof. The equivalences have been discussed above and essentially follow from Rossman's theorem.
The last remark is a special case of Theorem 8. □

6.2 Width one 

Let Γ be an ω-categorical structure with relational signature τ, and Π be the canonical
(1, k)-Datalog program for Γ. By Theorem 8, the class of τ-structures accepted by Π is itself a constraint
satisfaction problem with an ω-categorical template. We denote this template by Γ(1, k).

Theorem 10. Let Γ be ω-categorical. A constraint satisfaction problem CSP(Γ) can be solved by
a (1, k)-Datalog program if and only if there is a homomorphism from Γ(1, k) to Γ.

Proof. The constraint satisfaction problem for Γ has width (1, k) if and only if the canonical
(1, k)-Datalog program Π solves CSP(Γ), by Theorem 5. Since CSP(Γ) is closed under disjoint unions,
we can apply Theorem 8 and know that CSP(Γ) equals CSP(Γ(1, k)). Therefore, Lemma 2 implies
that CSP(Γ) has width (1, k) if and only if there is a homomorphism from Γ(1, k) to Γ. □

6.3 Arc-consistency 

The arc-consistency procedure (AC) is an algorithm for constraint satisfaction problems that is
intensively studied in Artificial Intelligence. It is sometimes also called hyper-arc consistency
or generalized arc consistency to stress the fact that it can also deal with constraints of arity larger
than two. It can be described as the subset of the canonical Datalog program of width one that
consists of all rules with bodies containing at most one non-IDB. For finite templates T it is known
that the arc-consistency procedure solves CSP(T) if and only if CSP(T) has width one [21]. For
infinite structures, this is no longer true: consider for instance CSP(⋪), which has width 0, but
cannot be solved by the arc-consistency procedure. The reason is that the width one canonical
Datalog program for ⋪ has no non-trivial unary predicates, and we thus have to consider at least
three relations in the input to infer that the input contains a triangle.
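
For a finite template the procedure can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the unary IDBs are represented as subsets of a finite template domain (for an infinite ω-categorical template they would be the primitive positive definable sets instead), and the function name and encoding of constraints are our own.

```python
def arc_consistency(variables, constraints, template_dom, template_rel):
    """Return False iff some variable is left without a candidate value (false is derived).
    constraints: list of (R, (u_1, ..., u_k)); template_rel: dict R -> set of tuples."""
    dom = {u: set(template_dom) for u in variables}
    changed = True
    while changed:
        changed = False
        for R, scope in constraints:
            # tuples of R that are compatible with the current candidate sets
            support = [t for t in template_rel[R]
                       if all(t[i] in dom[u] for i, u in enumerate(scope))]
            for j, u in enumerate(scope):
                allowed = {t[j] for t in support}
                if not dom[u] <= allowed:
                    dom[u] &= allowed
                    changed = True
                if not dom[u]:
                    return False
    return True
```

Each pruning step uses a single input constraint together with the current unary information, which mirrors the restriction to rules with at most one non-IDB in the body. On the template K_2 the procedure does not derive false on a triangle, although the triangle is not 2-colourable.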

Theorem 11. Let Γ be an ω-categorical relational structure. If CSP(Γ) is solved by the
arc-consistency algorithm, then Γ is homomorphically equivalent to a finite structure.

Proof. Let τ be the signature of Γ. Since Γ is ω-categorical, there are only a finite number of
primitive positive definable nonempty sets O_1, ..., O_n. We define the orbit structure of Γ, which is
the finite relational τ-structure with domain {O_1, ..., O_n} where a k-ary relation R from τ holds
on O_{i_1}, ..., O_{i_k} iff for every vertex v_j in the orbit O_{i_j} there are vertices v_1, ..., v_{j-1}, v_{j+1}, ..., v_k
from O_{i_1}, ..., O_{i_{j-1}}, O_{i_{j+1}}, ..., O_{i_k}, respectively, such that R holds on v_1, ..., v_k in Γ.

We shall prove that for every instance S of CSP(Γ) the following two statements are equivalent:

1. The arc-consistency procedure for Γ does not derive false on instance S.

2. S is homomorphic to the orbit structure of Γ.

Proof of the Claim: 

(1) $\rightarrow$ (2). Every unary relation that can be inferred by the arc-consistency procedure is
definable by a pp-formula and hence is an element of $\{O_1, \dots, O_n\}$. For every variable $u$ of $S$, let
$T^u$ be the subset of $\{O_1, \dots, O_n\}$ containing all those unary IDBs $O_i$, $1 \le i \le n$, such that $O_i(u)$
is derived by the arc-consistency algorithm. By the structure of the rules of the arc-consistency
algorithm $T^u$ is closed under intersection. Define $h$ to be the mapping from $D_S$ to $\{O_1, \dots, O_n\}$
that maps every variable $u$ to the minimum element of $T^u$, which will be denoted by $\bigcap T^u$. We shall
show that $h$ is a homomorphism from $S$ to the orbit structure of $\Gamma$. Let $R \in \tau$, let $(u_1, \dots, u_k)$
be a tuple of $R^S$ and let $(\bigcap T^{u_1}, \dots, \bigcap T^{u_k})$ be its image according to $h$. Fix any $j = 1, \dots, k$ and
let $O_{i_j}$ be the set containing all those $v_j$ such that there are vertices $v_1, \dots, v_{j-1}, v_{j+1}, \dots, v_k$
from $\bigcap T^{u_1}, \dots, \bigcap T^{u_{j-1}}, \bigcap T^{u_{j+1}}, \dots, \bigcap T^{u_k}$, respectively, such that $R$ holds on $v_1, \dots, v_k$ in $\Gamma$. By the
construction of the arc-consistency algorithm there is the rule

$$O_{i_j}(x_j) \mathrel{:-} R(x_1, \dots, x_k), \bigcap T^{u_1}(x_1), \dots, \bigcap T^{u_{j-1}}(x_{j-1}), \bigcap T^{u_{j+1}}(x_{j+1}), \dots, \bigcap T^{u_k}(x_k)$$

which allows us to derive $O_{i_j}(u_j)$. As $\bigcap T^{u_j} \subseteq O_{i_j}$ we conclude that $(\bigcap T^{u_1}, \dots, \bigcap T^{u_k})$ belongs
to the relation $R$ in the orbit structure.

(2) $\rightarrow$ (1). Let $h$ be a homomorphism from $S$ to the orbit structure of $\Gamma$. It is easy to prove
by induction on the derivation order that $h(u) \subseteq R$ for every $R(u)$ derived by the arc-consistency
algorithm for $\Gamma$ on instance $S$. Hence the goal predicate cannot be derived.

We will now show that the arc-consistency procedure correctly decides CSP($\Gamma$) if and only
if the orbit structure is homomorphic to $\Gamma$. Since the orbit structure homomorphically maps to
itself, the claim proven above shows that the arc-consistency procedure does not derive false on
the orbit structure. Hence, if the arc-consistency procedure correctly decides CSP($\Gamma$) then the
orbit structure homomorphically maps to $\Gamma$.

Conversely, suppose that there is a homomorphism $h$ from the orbit structure to $\Gamma$. To show
that the arc-consistency procedure correctly decides CSP($\Gamma$), it suffices to show that an instance
$S$ where the arc-consistency procedure does not derive false homomorphically maps to $\Gamma$. By the
claim proven above there is a homomorphism $g$ from $S$ to the orbit structure of $\Gamma$. Composing $g$
and $h$ yields the desired homomorphism from $S$ to $\Gamma$.

Finally, we show that there is always a homomorphism from $\Gamma$ to its orbit structure. Observe
that for every vertex $v$ of $\Gamma$ the intersection $O_v$ of all elements in $\{O_1, \dots, O_n\}$ that contain $v$ is
also a member of this set, because the class of pp-definable sets is closed under intersection.
We claim that the mapping $f$ that sends every vertex $v$ of $\Gamma$ to $O_v$ defines a homomorphism
from $\Gamma$ to its orbit structure.

Indeed, suppose that $(v_1, \dots, v_k)$ is in $R^\Gamma$ for a $k$-ary $R \in \tau$. Let $(O_{i_1}, \dots, O_{i_k})$ be the image of
$(v_1, \dots, v_k)$ under $f$. For every $j = 1, \dots, k$, let $P_j$ be the relation defined over $\Gamma$ by the following
formula with free variable $x_j$.

$$\exists x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_k.\; R(x_1, \dots, x_k) \wedge O_{i_1}(x_1) \wedge \dots \wedge O_{i_{j-1}}(x_{j-1}) \wedge O_{i_{j+1}}(x_{j+1}) \wedge \dots \wedge O_{i_k}(x_k)$$

Since $v_j \in P_j$ and $P_j$ is pp-definable, $O_{i_j} \subseteq P_j$. Hence, $R(O_{i_1}, \dots, O_{i_k})$ holds in the orbit
structure. Therefore, the arc-consistency algorithm solves CSP($\Gamma$) if and only if $\Gamma$ is homomorphically
equivalent to its orbit structure. □

7 Bounded strict width 

The notion of strict width was introduced for finite domain constraint satisfaction problems by
Feder and Vardi [21], and was defined in terms of the canonical Datalog program. In the terminology
of the constraint satisfaction literature in Artificial Intelligence, strict width $l$ is equivalent
to 'strong $l$-consistency implies global consistency'. Based on our generalization of the concept
of canonical Datalog programs, we study the analogously defined concept of strict width $l$ for
$\omega$-categorical structures.

The notion of strict width is defined as follows. Recall that the canonical $(l, k)$-Datalog program
$\Pi$ for CSP($\Gamma$) receives as input an instance $S$ of CSP($\Gamma$) and returns an expansion $S'$ of $S$ over
$\tau'$, where $\tau'$ is the vocabulary that contains all predicates in $\tau$ as well as a predicate for every IDB
of $\Pi$. The structure $S'$ can be seen as an instance of CSP($\Gamma'$) where $\Gamma'$ is the expansion of $\Gamma$
in which every IDB $R$ is interpreted as the at most $l$-ary primitive positive relation associated
to it. The instance $S'$ is called globally consistent if every partial homomorphism, i.e., every
homomorphism from an induced substructure of $S'$ to $\Gamma'$, can be extended to a homomorphism
from $S'$ to $\Gamma'$. If for some $k \ge l + 1 \ge 3$ all instances of CSP($\Gamma'$) that are computed by the canonical
$(l, k)$-program are globally consistent, we say that $\Gamma$ has strict width $l$. Note that strict width $l$
implies width $l$, and hence CSP($\Gamma$) can be solved in polynomial time when $\Gamma$ has bounded strict
width.

In this section we present an algebraic characterization of strict width $l$ for $\omega$-categorical
templates $\Gamma$. The algebraic approach rests on the notion of polymorphisms. Let $\Gamma$ be a relational
structure with signature $\tau$. A polymorphism is a homomorphism from $\Gamma^l$ to $\Gamma$, for some $l$, where $\Gamma^l$
is a relational $\tau$-structure defined as follows. The vertices of $\Gamma^l$ are $l$-tuples over elements from $V_\Gamma$,
and $k$ such $l$-tuples $(v^i_1, \dots, v^i_l)$, $1 \le i \le k$, are joined by a $k$-ary relation $R$ from $\tau$ if $(v^1_j, \dots, v^k_j)$
is in $R^\Gamma$, for all $1 \le j \le l$.
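
For a finite structure, the definition of a polymorphism can be checked directly by brute force. The
following Python sketch is only an illustration under that finiteness assumption (the templates
studied here are infinite); the names `is_polymorphism`, `median`, `LEQ` and `NEQ` are hypothetical
and chosen for this example.

```python
from itertools import product

def is_polymorphism(f, arity, relations):
    """Check that the `arity`-ary operation f preserves every given relation.

    relations: iterable of sets of equal-length tuples over a finite domain.
    f preserves R if applying f coordinate-wise to any `arity` tuples of R
    yields a tuple that is again in R.
    """
    for R in relations:
        for rows in product(R, repeat=arity):
            # zip(*rows) iterates over the coordinates of the chosen tuples.
            image = tuple(f(*column) for column in zip(*rows))
            if image not in R:
                return False
    return True

def median(x, y, z):
    """Ternary median on a linearly ordered set."""
    return sorted((x, y, z))[1]

# Two binary relations on the domain {0, 1, 2}.
LEQ = {(a, b) for a in range(3) for b in range(3) if a <= b}
NEQ = {(a, b) for a in range(3) for b in range(3) if a != b}
```

Here `is_polymorphism(median, 3, [LEQ])` holds because the median is monotone, whereas the median
does not preserve `NEQ`: applied to the tuples (0, 1), (1, 0), (1, 2) it yields (1, 1).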

We say that an operation $f$ is a near-unanimity operation (short, nu-operation) if it satisfies
the identities $f(x, \dots, x, y, x, \dots, x) = x$, i.e., in the case that the arguments have the same value
$x$ except at most one argument, the operation has the value $x$. We say that $f$ is a near-unanimity
operation on $A$ (short, a nuf on $A$) if it satisfies the identities $f(x, \dots, x, y, x, \dots, x) = x$ for all $x, y \in A$.
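
The near-unanimity identities on a finite set $A$ can be verified exhaustively. The sketch below is
an illustration only; the helper name `is_nu_on` and the example operations are assumptions of this
example and do not come from the text.

```python
def is_nu_on(f, arity, A):
    """Check f(x, ..., x, y, x, ..., x) == x for all x, y in A and
    every position of the exceptional argument y."""
    for x in A:
        for y in A:
            for pos in range(arity):
                args = [x] * arity
                args[pos] = y
                if f(*args) != x:
                    return False
    return True

def median(x, y, z):
    """Ternary median on a linearly ordered set (a majority operation)."""
    return sorted((x, y, z))[1]

def first_projection(x, y, z):
    """Projection to the first argument; fails the identity f(y, x, x) = x."""
    return x
```

With these definitions, `is_nu_on(median, 3, range(4))` holds, while the first projection is not a
near-unanimity operation on any set with at least two elements.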

Feder and Vardi [21] proved that a finite template $\Gamma$ has an $(l + 1)$-ary near-unanimity operation
(in this case, they say that $\Gamma$ has the $(l + 1)$-mapping property) if and only if CSP($\Gamma$) has strict
width $l$. Another proof of this theorem was given in [29]. It is stated there that the proof extends
to arbitrary infinite templates, if we want to characterize bounded strict width on instances of
the constraint satisfaction problem that might be infinite. However, we would like to describe the
complexity of constraint satisfaction problems with finite instances.

In fact, there are structures $\Gamma$ that do not have a nu-operation, but where $\Gamma$ has bounded strict
width. One example for such a structure is the universal triangle-free graph. A theorem by
Larose and Tardif shows that every finite or infinite graph with a nu-operation is bipartite [36].
Since the universal triangle-free graph contains all cycles of length larger than three, it therefore
cannot have a nu-operation. However, the universal triangle-free graph has strict width 2. Indeed,
for any instance $S$ accepted by the canonical (2, 3)-Datalog program, every partial mapping from $S$
to the universal triangle-free graph satisfying all the facts derived by the program (and in
particular not containing any triangle) can be extended to a complete homomorphism from $S$ to the
universal triangle-free graph; this follows from the extension properties of the template.
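
For finite instances, the CSP of the universal triangle-free graph amounts to a triangle test. The
sketch below assumes that the edge relation of the template is symmetric and irreflexive and that
every finite triangle-free graph embeds into it; the function name
`maps_to_universal_triangle_free` and the encoding of an instance as a list of pairs are
hypothetical choices for this illustration.

```python
def maps_to_universal_triangle_free(edges):
    """Decide whether a finite instance, given as pairs (u, v) for E(u, v),
    maps homomorphically to the universal triangle-free graph.

    Under the stated assumptions this holds exactly when the underlying
    undirected graph of the instance has no loop and no triangle.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            return False              # a loop cannot be mapped to a loopless graph
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for u, v in edges:
        if adj[u] & adj[v]:           # a common neighbour closes a triangle
            return False
    return True
```

The check inspects three edges at a time, which matches the observation that at least three
relations of the input have to be considered to detect a triangle.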

Theorem 12 characterizes strict width $l$, $l \ge 2$, for constraint satisfaction with $\omega$-categorical
templates. We first need an intermediate result.

Lemma 3. Let $\Gamma$ be a $\tau$-structure such that CSP($\Gamma$) has strict width $l$ and let $\tau_=$ be the superset
of $\tau$ in which we add a new binary relation symbol $P_=$. Let $\Gamma_=$ be the $\tau_=$-expansion of $\Gamma$ in which
$P_=$ is interpreted by the usual equality relation $\{(x, x) \mid x \in D_\Gamma\}$. Then CSP($\Gamma_=$) also has strict
width $l$.

Proof. Let $\Pi_=$ be the canonical $(l, k)$-program for CSP($\Gamma_=$), and $S_=$ be an instance of CSP($\Gamma_=$).
Let $S'_=$ be the structure computed by $\Pi_=$ on $S_=$. Let $\theta$ be the binary relation on the universe of
$S_=$ defined to be the reflexive, symmetric, and transitive closure of $P_=^{S_=}$.

Let $S_\theta$ be the $\tau$-reduct of $S_=$, in which we identify all vertices that are $\theta$-related into a single
element. More precisely, the universe of $S_\theta$ is the set of classes of $\theta$, $\{\theta_a \mid a \in D_{S_=}\}$, where $\theta_a$ denotes
the $\theta$-class of $a$, and for every $R \in \tau$, say $r$-ary, $R^{S_\theta} = \{(\theta_{a_1}, \dots, \theta_{a_r}) \mid (a_1, \dots, a_r) \in R^{S_=}\}$. We
now consider $S_\theta$ as a CSP($\Gamma$) instance. Let $\Pi$ be the canonical $(l, k)$-program of $\Gamma$. It is easy to
prove by induction on the derivation order that if $R$ is an IDB, say $r$-ary, and $R(\theta_{a_1}, \dots, \theta_{a_r})$ is
derived by $\Pi$ on $S_\theta$, then $R(a_1, \dots, a_r)$ is derived by $\Pi_=$ on $S_=$.
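
The passage from an instance with equality constraints to its quotient can be sketched with a
union-find structure. This is a minimal illustration for finite instances; the function name
`contract_equalities` and the dictionary-based encoding of relations are assumptions of this
example.

```python
def contract_equalities(universe, eq_pairs, relations):
    """Quotient an instance by the equivalence generated by `eq_pairs`.

    universe:  iterable of elements of the instance
    eq_pairs:  pairs (a, b) standing for the equality constraints
    relations: dict mapping a relation name to a set of tuples over `universe`
    Returns (cls, quotient): cls maps each element to a representative of its
    class, and quotient holds the relations rewritten over representatives.
    """
    parent = {a: a for a in universe}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for a, b in eq_pairs:
        parent[find(a)] = find(b)           # merge the two classes

    cls = {a: find(a) for a in parent}
    quotient = {name: {tuple(cls[a] for a in t) for t in tuples}
                for name, tuples in relations.items()}
    return cls, quotient
```

For example, with universe {1, 2, 3, 4}, equalities (1, 2) and (2, 3), and E = {(1, 4), (3, 4)},
the elements 1, 2, 3 fall into one class and the two tuples of E collapse into a single tuple.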

We have to show that $S'_=$ is globally consistent. So suppose that there is a partial homomorphism
$h$ from $S'_=$ to $\Gamma'_=$. Since $l \ge 2$ and $k \ge 3$, $\Pi_=$ will be able to derive that all elements in the
same $\theta$-class have to get the same value, and hence if $h$ is a partial homomorphism this implies
that for all elements $a, b$ in the domain of $h$ that are $\theta$-related, $h(a) = h(b)$. Define $h_\theta$ to be the
partial mapping that maps every $\theta_a$ with $a$ in the domain of $h$ to $h(a)$. By the definition of $S_\theta$
and the analysis of the facts derived by $\Pi$ on $S_\theta$ carried out above we know that $h_\theta$ is a partial
homomorphism from $S'_\theta$ to $\Gamma'$. Hence $h_\theta$ can be extended to a full homomorphism $h'_\theta$ from $S'_\theta$ to
$\Gamma'$. Finally, the mapping $h'$ defined by $h'(a) = h'_\theta(\theta_a)$ is a homomorphism from $S_=$ to $\Gamma_=$ and
hence also from $S'_=$ to $\Gamma'_=$. □

The proof of the following theorem is based on ideas from [21] and [29]. 



Theorem 12. Let $\Gamma$ be an $\omega$-categorical structure with relational signature $\tau$ of bounded maximal
arity. Then the following are equivalent, for $l \ge 2$:

1. CSP($\Gamma$) has strict width $l$.

2. For every finite subset $A$ of $\Gamma$ there is an $(l + 1)$-ary polymorphism of $\Gamma$ that is a nuf on $A$.

Proof. We first show that (1) implies (2).

We assume that CSP($\Gamma$) has strict width $l$ and prove that for every finite subset $A$ of $\Gamma$ there
is a polymorphism of $\Gamma$ that is an $(l + 1)$-ary nuf on $A$. Let $\tau^A$ be the superset of $\tau$ in which we
add a new unary relation symbol $R_a$ for each element $a$ in $A$, let $\Gamma^A$ be the $\tau^A$-expansion of $\Gamma$ in
which $R_a$ is interpreted by the singleton relation $\{a\}$, and let $\Delta$ be $\Gamma^{l+1}$. Consider the set $B$ of tuples
$(a_0, \dots, a_l)$ in $A^{l+1}$ that have identical entries $a_i = a$ except possibly at one exceptional position. Let
$\Delta^A$ be the $\tau^A$-expansion of $\Delta$ obtained by placing in $R_a$ all tuples $(a, \dots, a, b, a, \dots, a)$ in $B$ whose
majority element is $a$. Every homomorphism from $\Delta^A$ to $\Gamma^A$ is by construction a polymorphism of
$\Gamma$ that is a nuf on $A$. Lemma 2 shows that if every finite substructure of $\Delta^A$ homomorphically
maps to $\Gamma^A$, then $\Delta^A$ homomorphically maps to $\Gamma^A$ as well.

Let $S^A$ be any finite substructure of $\Delta^A$, and let $S$ be the $\tau$-reduct of $S^A$, which we see as an
instance of CSP($\Gamma$). We shall show that there exists a homomorphism $h$ from $S$ to $\Gamma$ that sends
every element $(a, \dots, a, b, a, \dots, a)$ in $B \cap D_S$ to its majority element $a$. Hence $h$ will also define a
homomorphism from $S^A$ to $\Gamma^A$.

Let $\tau_=$ be the superset of $\tau$ in which we add a new binary predicate $P_=$ and let $\Gamma_=$ be the
expansion of $\Gamma$ in which $P_=$ is interpreted as the equality relation. Let $S_=$ be the following
$\tau_=$-structure: The universe of $S_=$ is $D_S \times \{0, 1\}$. For every predicate $P \in \tau$, say $r$-ary, we define $P^{S_=}$
to be $P^S \times \{(0, \dots, 0)\}$. Furthermore

$$P_=^{S_=} = \{((a, 0), (a, 1)) \mid a \in D_S\}.$$

By Lemma 3, $\Gamma_=$ has strict width $l$. Hence, there exists some $k$ such that all instances of
CSP($\Gamma'_=$) computed by the canonical $(l, k)$-program $\Pi$ are globally consistent. Let $S'_=$ be the
instance computed by $\Pi$ on $S_=$ (which we recall has an expanded vocabulary $\tau'_=$). Now consider
the partial assignment $g$ defined on $(B \cap D_S) \times \{1\}$ that sends every tuple of the form
$((a, 1), \dots, (a, 1), (b, 1), (a, 1), \dots, (a, 1))$ to $a$. We shall see that $g$ is a partial homomorphism from
$S'_=$ to $\Gamma'_=$. Indeed, let $(\mathbf{a}_1, \dots, \mathbf{a}_r) \in R^{S'_=}$ be any tuple entirely contained in the domain of $g$. For
every $1 \le j \le r$, $\mathbf{a}_j$ is of the form $((a_j, 1), \dots, (a_j, 1), (b_j, 1), (a_j, 1), \dots, (a_j, 1))$. This tuple has
necessarily been placed there by the Datalog program, and hence $R$ is an IDB and has arity
at most $l$. The pigeon-hole principle guarantees that there exists an index $i$, $1 \le i \le l + 1$,
such that for every $1 \le j \le r$, the $i$th entry of $\mathbf{a}_j$ is precisely $(a_j, 1)$. Since the $i$th projection is a
homomorphism from $S$ to $\Gamma$, it cannot violate any fact derived by the canonical $(l, k)$-program and
hence $(a_1, \dots, a_r) \in R^{\Gamma'_=}$. Since $S'_=$ is globally consistent this implies that $g$ can be extended to a
full homomorphism $g'$ from $S'_=$ to $\Gamma'_=$. Finally we obtain the desired homomorphism $h : S \to D_\Gamma$
as $h(a_1, \dots, a_{l+1}) = g'((a_1, 0), \dots, (a_{l+1}, 0))$.

Next we show that (2) implies (1). Let $S$ be an instance of CSP($\Gamma$) such that the canonical
$(l, k)$-program $\Pi$ does not derive false on $S$, where $k$ is larger than the maximal arity of the
relations in $\tau$, and at least $l + 1$. Let $\Gamma'$ be the expansion of $\Gamma$ by all at most $l$-ary primitive
positive definable relations, and let $S'$ be the instance of CSP($\Gamma'$) computed by $\Pi$ on the instance
$S$. We shall prove that every homomorphism $h$ from an induced substructure $K$ of $S'$ to $\Gamma'$ can be
extended to another element of $S'$. We prove this by induction on the number $i$ of elements in $K$. Let
$v_1, \dots, v_i$ be the elements of $K$, and let $v_{i+1}$ be an element of $S'$ that is not in the domain of $h$.

For the case that $i \le l$, let $R_j(\bar{v}_j)$, $j \in J$, be the set of all predicates in $S'$ containing only
elements from $\{v_1, \dots, v_i, v_{i+1}\}$, and let $R$ be the IDB associated to the formula $\exists v_{i+1} \bigwedge_{j \in J} R_j(\bar{v}_j)$
with free variables $v_1, \dots, v_i$ (here we view $v_1, \dots, v_{i+1}$ as variables of the formula). Since $R_j(\bar{v}_j)$, $j \in J$,
are derived by $\Pi$, the predicate $R(v_1, \dots, v_i)$ is also derived. Since $h$ preserves this predicate,
there exists some way to extend $h$ such that it also preserves $R_j(\bar{v}_j)$ for all $j \in J$.

For the inductive case where $i \ge l + 1$, select elements $w_1, \dots, w_{l+1}$ in $\{v_1, \dots, v_i\}$, and let
$h_j$ be the restriction of $h$ in which $w_j$ is undefined, for $j \in \{1, \dots, l + 1\}$. By induction, each of the
mappings $h_j$ can be extended to a homomorphism $h'_j$ from the structure induced by $\{v_1, \dots, v_{i+1}\} \setminus \{w_j\}$
in $S'$ to $\Gamma'$. Let $A$ be a set containing the images of all functions $h'_j$ and let $g$ be an $(l + 1)$-ary
polymorphism of $\Gamma'$ that is a nuf on $A$ (observe that $\Gamma$ and $\Gamma'$ have the same polymorphisms).
Define $b$ to be $g(h'_1(v_{i+1}), \dots, h'_{l+1}(v_{i+1}))$. We claim that the extension $h'$ of $h$ mapping $v_{i+1}$ to $b$
is a homomorphism from the structure induced by $\{v_1, \dots, v_{i+1}\}$ in $S'$ to $\Gamma'$.

First we shall see that $h'$ preserves all IDBs. Let $R(u_1, \dots, u_r)$, $r \le l$, be any predicate derived
by $\Pi$ with $u_1, \dots, u_r \in \{v_1, \dots, v_{i+1}\}$. Define $a_j$ to be $h'(u_j)$ so that we have to prove that
$(a_1, \dots, a_r)$ belongs to $R^{\Gamma'}$. For each $j = 1, \dots, l + 1$ we construct an $r$-tuple $b^j = (b^j_1, \dots, b^j_r)$ in
$R^{\Gamma'}$ in the following way: if $w_j$ is not in $\{u_1, \dots, u_r\}$, then we set $b^j_t$ to $h'_j(u_t)$ for $t \in \{1, \dots, r\}$.
Otherwise, we consider the restriction of $h'_j$ to $\{u_1, \dots, u_r\} \setminus \{w_j\}$. By induction hypothesis this
restricted mapping can be extended to a mapping $h''_j$ defined for all of $\{u_1, \dots, u_r\}$. We define $b^j_t$
to be $h''_j(u_t)$ for all $t \in \{1, \dots, r\}$. We claim that the tuple $(g(b^1_1, \dots, b^{l+1}_1), \dots, g(b^1_r, \dots, b^{l+1}_r))$
is indeed $(a_1, \dots, a_r)$. The reason is that at least $l$ of the elements $b^1_t, \dots, b^{l+1}_t$ are equal to $a_t$.
Because $g$ is a near-unanimity operation on $A$, we have $g(b^1_t, \dots, b^{l+1}_t) = a_t$. Hence, by the definition
of polymorphisms, this tuple belongs to $R^{\Gamma'}$.

It remains to show that $h'$ preserves all EDBs. Let $R(u_1, \dots, u_r)$ be any tuple initially in $S$.
Let us denote $h'(u_j)$ by $a_j$ so that we have to prove that $(a_1, \dots, a_r) \in R^\Gamma$. We shall show that for
every $I \subseteq \{1, \dots, r\}$ there is a tuple $(b_1, \dots, b_r)$ in $R^\Gamma$ such that $b_i = a_i$ for all $i \in I$, by induction
on $|I|$.

For the case $|I| \le l$, let $i_1 < \dots < i_{l'}$ be the elements of $I$, and let $R'$ be the IDB associated to the
formula $\exists_{i \notin I} v_i\, R(v_1, \dots, v_r)$ with free variables $v_{i_1}, \dots, v_{i_{l'}}$. Since $R'(v_{i_1}, \dots, v_{i_{l'}}) \mathrel{:-} R(v_1, \dots, v_r)$ is
a rule of $\Pi$ and, as we have seen, $(a_{i_1}, \dots, a_{i_{l'}}) \in R'^{\Gamma'}$, the claim follows. For the inductive
case $|I| > l$, select $l + 1$ elements $i_1, \dots, i_{l+1}$ of $I$ and consider the tuple $b^j = (b^j_1, \dots, b^j_r)$ given
inductively for $I \setminus \{i_j\}$. Let $g$ be an $(l + 1)$-ary polymorphism which is a nuf on the set containing
all elements in all tuples $b^j$ for $j = 1, \dots, l + 1$. The tuple $(g(b^1_1, \dots, b^{l+1}_1), \dots, g(b^1_r, \dots, b^{l+1}_r))$ satisfies
the claim. □

Note that in several papers including [4, 5] and the conference version that precedes this one,
condition (2) has been stated in a different but essentially equivalent way using the notion of quasi
near-unanimity operation.^

We say that an operation $f$ is a quasi near-unanimity operation (short, qnu-operation), if it
satisfies the identities $f(x, \dots, x, y, x, \dots, x) = f(x, \dots, x)$, i.e., in the case that the arguments
have the same value $x$ except at one argument position, the operation has the value $f(x, \dots, x)$. In
other words, the value $y$ of the exceptional argument does not influence the value of the operation
$f$. Several well-known temporal and spatial constraint languages have polymorphisms that are
qnu-operations [5].

For every subset $A$ of $\Gamma$, we say that an operation $f$ is idempotent on $A$ if $f(a, \dots, a) = a$ for all
$a \in A$. Hence, if a qnu-operation $f$ is idempotent on the entire domain, then $f$ is a near-unanimity
operation. If a polymorphism $f$ of $\Gamma$ has the property that for every finite subset $A$ of $\Gamma$ there is
an automorphism $\alpha$ of $\Gamma$ such that $f(x, \dots, x) = \alpha(x)$ for all $x \in A$, we say that $f$ is oligopotent.
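
The difference between quasi near-unanimity and near-unanimity can be made concrete on a small
finite set. The sketch below is an illustration only; the names `is_qnu_on`, `is_idempotent_on`
and `shifted_median` are hypothetical, and no structure $\Gamma$ is modelled, so the code checks
identities but not the polymorphism property.

```python
def is_qnu_on(f, arity, A):
    """Check f(x, ..., x, y, x, ..., x) == f(x, ..., x) for all x, y in A
    and every position of the exceptional argument y."""
    for x in A:
        diagonal = f(*([x] * arity))
        for y in A:
            for pos in range(arity):
                args = [x] * arity
                args[pos] = y
                if f(*args) != diagonal:
                    return False
    return True

def is_idempotent_on(f, arity, A):
    """Check f(a, ..., a) == a for all a in A."""
    return all(f(*([a] * arity)) == a for a in A)

def median(x, y, z):
    return sorted((x, y, z))[1]

def shifted_median(x, y, z):
    """Median followed by the cyclic shift a -> a + 1 (mod 4) on {0, 1, 2, 3}."""
    return (median(x, y, z) + 1) % 4
```

On `range(4)` the operation `shifted_median` satisfies the qnu identities but is not idempotent,
so it is not a near-unanimity operation; its diagonal $x \mapsto f(x, x, x)$ is a permutation,
which loosely mirrors the role of the automorphism $\alpha$ in the definition of oligopotent. The
plain `median` is both qnu and idempotent, hence a near-unanimity operation.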

Theorem 13. Let $\Gamma$ be an $\omega$-categorical structure with relational signature $\tau$ of bounded maximal
arity and let $l \ge 2$. Then the following are equivalent:

1. CSP($\Gamma$) has strict width $l$.

2. For every finite subset $A$ of $\Gamma$ there is an $(l + 1)$-ary polymorphism of $\Gamma$ that is a nuf on $A$.

3. $\Gamma$ has an oligopotent $(l + 1)$-ary polymorphism that is a qnu-operation.

4. Every primitive positive formula is in $\Gamma$ equivalent to a conjunction of at most $l$-ary primitive
positive formulas.

^ In the conference version of this paper, these operations were called weak near-unanimity operations.
However, since another similar but much weaker relaxation of near-unanimity operations was introduced
recently in universal algebra as well, we decided to call our operations quasi near-unanimity operations.



Proof. The equivalence of (1) and (2) has been shown in Theorem 12, and the equivalence of (2) 
and (3) follows from a direct application of Lemma 2. The equivalence of (3) and (4) is shown 
in [4]. □ 



Concerning the condition of oligopotency in statement (3) of Theorem 13, we want to remark
that for every $\omega$-categorical structure $\Gamma$ there is a template that has the same CSP and where
all polymorphisms are oligopotent. It was shown in [25] that every $\omega$-categorical structure is
homomorphically equivalent to a model-complete core $\Delta$, i.e., $\Delta$ has the property that for every
finite subset $A$ of the domain of $\Delta$ and for every endomorphism $e$ of $\Delta$ (an endomorphism is a
unary polymorphism) there exists an automorphism $\alpha$ of $\Delta$ such that $\alpha(x) = e(x)$ for all $x \in A$.
(Moreover, it is also known that $\Delta$ is unique up to isomorphism, and $\omega$-categorical.)

Corollary 2. Suppose that $\Delta$ is an $\omega$-categorical model-complete core. Then $\Delta$ has strict width $l$
if and only if $\Delta$ has an $(l + 1)$-ary qnu-polymorphism.